In free-space optical communication, the intensity of a laser beam is modulated by a message, the beam propagates through free space or the atmosphere to the receiver, and the received signal
is processed to reconstruct the original message. The promising features of this communication scheme, such as high bandwidth, power
efficiency,  and  security,  render  it  a  viable  means  for  high  data  rate  point-to-point  communication. 

In  this  dissertation,  we  adopt  a  stochastic  approach  to  address  two  major  issues  associated  with  free-space  optics:  digital  communication 
over  an  atmospheric  channel  and  maintaining  optical  alignment  between  the  transmitter  and  the  receiver,  in  spite  of  their  relative  motion. 
Associated  with  these  issues,  we  consider  several  detection,  estimation,  and  optimal  control  problems  with  point  process  observations. 
Although  these  problems  are  motivated  by  applications  in  free-space  optics,  they  are  also  of  direct  relevance  to  the  general  field  of 
estimation  theory  and  stochastic  control. 

We  study  the  detection  aspect  of  digital  communication  over  an  atmospheric  channel.  This  problem  is  formulated  as  an  M-ary  hypothesis 
testing  problem  involving  a  doubly  stochastic  marked  and  filtered  Poisson  process  in  white  Gaussian  noise.  The  formal  solutions  we  obtain 
for this problem are hard to express in an explicit form; thus, we approximate them by appropriate closed-form expressions. These
approximations  can  be  implemented  using  finite-dimensional,  nonlinear,  causal  filters. 

Regarding  the  optical  alignment  issue,  we  consider  two  problems:  active  pointing  and  cooperative  optical  beam  tracking.  In  the  active 
pointing  scheme  that  we  develop  for  short  range  applications,  the  receiving  station  estimates  the  center  of  its  incident  optical  beam  based  on 
the  output  of  a  position-sensitive  photodetector.  The  transmitter  receives  this  estimate  via  an  independent  communication  link  and 
incorporates  it  to  accurately  aim  at  the  receiving  station. 

A cooperative optical beam tracking system consists of two stations, each of which points its optical beam toward the
other. The stations employ the arrival direction of the incident optical beams as a guide to precisely point their own beam toward the
other station. We develop a detailed stochastic model for this system and employ it to determine a control law which maximizes the flow of
optical  energy  between  the  stations.  In  so  doing,  we  consider  the  effect  of  light  propagation  delay,  which  requires  a  point-ahead  mechanism 
to compensate for the displacement of the receiving station during propagation time.

Chapter 1


Introduction 

Free-space optics is regarded as a high-bandwidth and power-efficient means for
point-to-point communication with a wide range of applications, including
fixed-location terrestrial communication [1], communication between mobile robots [2],
airborne communication [3], and intersatellite communication [4]. In this mode of
communication, the (digital) data to be transmitted modulates the instantaneous power of
a laser beam, which propagates through free space or the atmosphere and eventually
strikes  the  receiver.  At  the  receiver,  an  optical  sensor  (photodetector)  converts  the 
optical  energy  into  an  electrical  signal,  which  is  processed  to  reconstruct  the  original 
data. 

Two  major  issues  are  associated  with  this  communication  scheme:  optical 
fade  caused  by  the  atmosphere  and  misalignment  of  the  stations  (transmitter  and 
receiver) due to their relative motion. This research investigates a stochastic
approach to finding solutions to these problems. Although the detection, estimation,
and control problems considered in this work are motivated by applications in
free-space optical communication, they are of direct relevance to the general field of
estimation theory and stochastic control.

Throughout  this  chapter,  we  briefly  explain  these  issues,  the  models  adopted 
or  developed  to  describe  them,  and  our  solutions  to  the  associated  problems.  In  the 
last  section,  we  fix  the  notation  that  will  be  used  in  the  following  chapters. 

1.1  Atmospheric  Turbulent  Channels 

The  atmosphere,  as  an  optical  medium,  introduces  random  fluctuations  in  the  power 
of  the  propagating  optical  field.  The  atmospheric  turbulence  caused  by  differential 
heating  of  the  air  is  characterized  in  terms  of  a  (slowly-varying  lognormal)  random 
process which modulates the optical power at the receiver. These random
fluctuations (fade) are a characteristic feature of atmospheric channels, in contrast with
conventional  fiber  optic  channels. 

In  general,  the  output  of  an  optical  sensor  can  be  modeled  by  a  marked  and 
filtered  Poisson  process,  which  is  a  stream  of  randomly  weighted  narrow  pulses 
arriving at the jump times of a Poisson process [5]. In addition, the electronic circuit
that follows the sensor corrupts this signal with thermal noise, which is modeled by
an additive white Gaussian process. Thus, the problem of detecting digital signals
over an atmospheric optical channel can be formulated as an M-ary hypothesis testing
problem whose observations consist of a doubly stochastic marked and filtered
Poisson process in additive white Gaussian noise.

While most prior studies of the detection problem above are based on some
simplified version of the previous model [6, 7, 8, 9, 10] or impose a linearity
constraint on the detector structure [11, 12, 13, 14, 15], a few tackle the problem in its
general form [5, 16, 17, 18]. A state-space approach developed in [5, 16] succeeds in
formulating a solution in terms of a stochastic partial differential-difference-integral
equation; however, the solution of this equation must be approximated using a
finite-dimensional filter whose approximation error is unclear.

We adopt another approach, following that in [17, 18].¹ This approach is based
on  the  fact  that  conditioned  on  the  number  and  arrival  times  of  the  pulses,  the 
problem  reduces  to  one  of  M-ary  detection  of  a  deterministic  signal  in  white  Gaussian 
noise, which has a known solution. Averaging over the number and arrival
times of the Poisson process then yields the solution to the original problem.
This  leads  to  an  expression  involving  an  infinite  sum  of  multiple  integrals,  which 
is  hard  to  reduce  to  an  explicit  (closed  form)  expression;  however,  under  certain 
assumptions,  this  infinite  sum  can  be  approximated  by  an  explicit  formula  or  a 
mathematically  tractable  equation. 

In  Chapter  2,  after  presenting  the  model  of  an  atmospheric  optical  channel 
and  stating  its  associated  detection  problem,  we  discuss  our  approach  in  more  detail. 
Based  on  the  infinite  sum  mentioned  above,  we  establish  mathematically  tractable 
upper  and  lower  bounds  on  the  exact  solution  and  study  the  behavior  of  these 
bounds  for  some  important  limiting  cases.  We  show  that  in  these  limiting  cases, 
the  lower  bound  tends  to  the  upper  bound.  This  motivates  us  to  approximate 
the  exact  solution  with  the  upper  (or  lower)  bound  under  conditions  which  closely 
approximate  the  limiting  cases. 

¹ In [17, 18], the avalanche gain of the optical sensor and the turbulent fade are not considered.
The latter is essential for an atmospheric channel.




Furthermore,  in  Chapter  2,  we  introduce  a  novel  technique  for  expressing  the 
infinite  sum  in  terms  of  an  expectation  taken  with  respect  to  a  stochastic  process. 
This  new  expression  is  then  used  in  order  to  develop  several  approximate  solutions. 
We remark that the results of Chapter 2 apply directly to fiber-optic
channels, which are a special case of atmospheric turbulent channels.

1.2  Optical  Alignment 

A major challenge in free-space optical communication is to maintain optical
alignment between the stations despite their relative motion. This relative motion, caused
by the mobility of the stations or mechanical vibration, can be comparable in
magnitude to the size of the narrow laser beams employed by the optical link. Therefore,
a  closed-loop  fine  alignment  mechanism  is  required  to  maintain  the  alignment  after 
the  link  is  established  through  a  coarse  open-loop  alignment  operation  referred  to 
as  spatial  acquisition. 

The  closed-loop  fine  alignment  can  be  decomposed  into  two  operations:  active 
pointing  and  optical  beam  tracking.  The  goal  of  the  first  operation  is  to  aim  the 
transmitted beam toward the receiver within an acceptable accuracy, while the
second operation is intended to maintain the transmitter within the field of view of
the receiver. For the purpose of alignment (pointing and tracking), the receiving and
transmitting optical devices are installed on electromechanical pointing assemblies,¹
which adjust the direction of the devices according to control signals generated by
appropriate closed-loop controllers.

¹ Alternatively, the stations can be equipped with steerable flat mirrors to control the light
direction.

The  closed-loop  controller  employed  by  an  optical  alignment  system  is  usually 
fed by the output of a position-sensitive photodetector (e.g., a quadrant detector). The
output  of  this  device  is  normally  modeled  by  a  vector-valued  point  process  or  ideally 
by  a  space-time  point  process.  Thus,  the  control  aspect  of  an  optical  alignment 
system  can  be  formulated  in  terms  of  a  stochastic  optimal  control  problem  with 
point  process  observations.  In  the  remainder  of  this  section,  we  briefly  explain  the 
operations of optical beam tracking and active pointing, and our stochastic approach
to resolving the associated problems.

1.2.1  Optical  Beam  Tracking 

In free-space optics, optical beam tracking is an active operation with the goal
of keeping the transmitter in the field of view of the receiver. Figure 1.1
illustrates a simple optical receiver employed in a free-space optical link. The receiver
is equipped with a lens (or curved mirror) to focus the incident optical field on
a position-sensitive photodetector. The position-sensitive photodetector is a
photoelectron converter whose surface is partitioned into small regions (pixels) with
independent outputs. The image of the incident optical field is a spot of light which
randomly moves on the surface of the photodetector due to the relative motion of the
transmitter and the receiver. In the absence of an adjusting mechanism, the effect
of the relative motion might be large enough to move the spot of light beyond the
surface of the photodetector. An active mechanism is needed to maintain the spot
of light at the center of the photodetector by consistently adjusting the direction
of the receiver. For this purpose, the receiver is installed on an electromechanical
pointing assembly which controls the direction of the receiver (in azimuth and
elevation), based on the control signals generated by a closed-loop controller. The
closed-loop controller estimates the location of the spot of light from the output
of the position-sensitive photodetector and determines a proper control in order to
direct the spot of light toward the center of the photodetector.

Figure 1.1: Schematic diagram of a simple optical receiver.

The  system  explained  above  has  been  modeled  by  a  linear  state-space  equation 
which is driven by a control vector and a vector-valued Wiener process [19]. Under
the  assumption  that  the  photodetector  has  an  infinite  spatial  resolution,  the  output 
of  the  photodetector  has  been  described  by  a  space-time  point  process  whose  rate  is 
modulated  by  the  state  of  the  system  [19].  Also,  the  goal  of  closed-loop  control  can 
be  formulated  in  terms  of  minimizing  a  quadratic  cost  functional  [19].  Assuming 
that the observation of the space-time point process is provided on $\mathbb{R}^2$ (which is
practically justified) and that the rate of this process has a Gaussian profile, the
solution to this optimal control problem (and its associated estimation problem) is
finite-dimensional and has been obtained in [20].

In  Chapter  3,  we  discuss  this  problem  in  more  detail  and  attempt  to  relax 
the  assumption  of  a  “Gaussian  profile”.  This  leads  us  to  reformulate  the  state 
estimation  problem  in  terms  of  estimating  the  state  of  a  discrete-time  linear  model 
with  additive  white  non-Gaussian  measurement  noise. 

For a practical system with a finite-resolution photodetector, the estimation
and control problems above are infinite-dimensional; thus, some sort of
approximation is required to solve them. A possible approach to this approximation problem,
especially for a high-resolution photodetector, is to modify the results of [20] for a
finite-resolution photodetector.¹ This approach is motivated by the fact that the
results of [20] are exact and are expressed in an explicit form.

¹ The alternative approach is to start from the stochastic partial differential equation which
describes the temporal evolution of the posterior density of the state vector and try to approximate
its solution using a finite-dimensional filter.

1.2.2 Active Pointing

For short range applications, in which the size of the receiving aperture is comparable
to the size of the optical beam, we develop an active pointing scheme in Chapter 4.
In this scheme, the receiver is equipped with a position-sensitive photodetector in
order to measure the intensity profile of its incident optical beam. The output
of the photodetector is used to estimate the center of the received optical beam,
whereupon the estimate is conveyed to the transmitter through an optical link or
a low-bandwidth RF channel. Based on this estimate, a pointing assembly adjusts
the transmitter direction with the goal of maintaining the center of the optical beam
close to the center of the receiving aperture. Note that the pointing direction must
be adjusted consistently in order to compensate for the relative motion between the
transmitter and the receiver. The block diagram in Figure 1.2 illustrates this active
pointing scheme.


Figure 1.2: Active pointing scheme for a short range free-space optical channel. [Block diagram with labels: Relative Motion, RF or Optical Link.]


We  model  the  dynamics  of  the  pointing  assembly  and  the  relative  motion  by 
a  stochastic  linear  state-space  equation.  The  observation  of  the  optical  intensity 
on  the  receiving  aperture  (photodetector  output)  will  be  modeled  by  a  space-time 
point  process  with  a  rate  depending  on  the  state  of  the  dynamical  model.  Also, 
we  formulate  the  pointing  control  problem  in  terms  of  seeking  a  control  law  that 
minimizes  a  quadratic  cost  functional  of  the  state  and  the  control  vectors. 

We note that for active pointing, the observations are provided only over a
subset of $\mathbb{R}^2$, in contrast to optical beam tracking, in which the observation is
given over all of $\mathbb{R}^2$. This causes significant technical difficulties, since the solutions to the
estimation and control problems associated with this model are infinite-dimensional.
For this reason, instead of an exact solution, we obtain an approximate estimator
and controller which are asymptotically optimal in the sense that they tend to the
optimal estimator and controller as the aperture tends to $\mathbb{R}^2$.

1.2.3  Cooperative  Optical  Beam  Tracking 

Cooperative  optical  beam  tracking  is  a  viable  solution  to  the  alignment  problem, 
especially for long range free-space optical communication. An optical link which
incorporates this alignment scheme consists of two stations, each of which
points its optical beam toward the other. The stations employ
the  arrival  direction  of  the  incident  optical  beams  as  a  guide  to  precisely  point 
their  own  beam  toward  the  other  station.  In  short  range  applications  in  which  the 
light  propagation  delay  is  negligible,  the  transmitter  points  its  optical  beam  along 
the  arrival  direction,  while  in  long  range  applications  with  significant  propagation 
delay (e.g., intersatellite communication), a point-ahead mechanism compensates
for  the  displacement  of  the  receiving  station  during  propagation  time.  The  concept 
of  “cooperative  optical  beam  tracking”  will  be  explained  in  detail  in  Chapter  5,  with 
reference  to  the  architecture  of  the  optical  transceivers  employed  in  this  alignment 
method. 
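
To give a sense of scale for the point-ahead mechanism, the following minimal sketch computes the one-way propagation delay and a first-order point-ahead angle, 2 v_perp / c, measured from the apparent arrival direction. The link range and transverse velocity are hypothetical illustrative values, not taken from this work, and the small-angle, constant-velocity approximation is an assumption of the sketch rather than the model developed in Chapter 5.

```python
# Illustrative sketch only: first-order point-ahead geometry under assumed values.
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def propagation_delay(range_m):
    """One-way light propagation delay (s) over a link of the given range (m)."""
    return range_m / SPEED_OF_LIGHT


def point_ahead_angle(v_perp):
    """Approximate point-ahead angle (rad) relative to the apparent arrival direction.

    The incident beam left the other station one delay ago, and the outgoing
    beam arrives one delay later, so the angular lead is about 2 * v_perp / c.
    """
    return 2.0 * v_perp / SPEED_OF_LIGHT


# Hypothetical intersatellite link: 40,000 km range, 7 km/s transverse velocity.
link_range = 4.0e7
v_perp = 7.0e3
delay = propagation_delay(link_range)
angle = point_ahead_angle(v_perp)
displacement = v_perp * delay  # receiver displacement during one-way propagation (m)
print(f"delay = {delay:.4f} s, displacement = {displacement:.0f} m, "
      f"point-ahead = {angle * 1e6:.1f} microrad")
```

For these assumed values the point-ahead angle is a few tens of microradians, which indicates why the delay cannot be ignored when the beams are narrow.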

The model we develop in Chapter 5 for this alignment scheme consists of two
dynamically coupled subsystems, each of which is modeled similarly to an
optical beam tracking system. In developing the model, we incorporate nonlinear
effects, major sources of disturbance, light propagation delay, and the fluctuations
of the optical intensity due to the modulation of data and optical fade. We believe
that including these details in the modeling procedure leads to a fairly accurate
model. For a special and important case in which the relative motion can be
decomposed into a predetermined large component and an unknown small component
(e.g., intersatellite applications), the nonlinear dynamics of the system will be
linearized around a nominal state trajectory. This linearized model¹ will be used later
in Chapter 6.

¹ Note that only the dynamical equations are linearized, not the observation model.

While cooperative optical beam tracking has already been analyzed using
simple deterministic models [21, 22], we shall use a stochastic approach in Chapter 6 in
order to study this alignment scheme. The design goal is to obtain two controllers
(one for each station) to maximize an objective functional which is defined as the
expected flow of optical energy between the stations. Note that this control
problem is decentralized in the sense that each station has access only to its own local
observation. For a negligible propagation delay, we directly maximize the objective
functional, while for the case of a significant propagation delay, we maximize a lower
bound on the objective functional.

1.3 Notations

In the following chapters, dependence on time will be displayed by a subscript $t$; e.g.,
a stochastic process or a deterministic signal will be denoted as $(\cdot)_t$. This convention
will be occasionally violated in Chapter 2 by using $(\cdot)(t)$ to show the dependence
on time or by using subscripts $i$ through $n$ in order to index a variable or a time-signal
over the integer set. In the latter case, a time-signal will be denoted by $(\cdot)_i(t)$. All
matrices will be denoted by capital letters, and we shall occasionally use capital
letters for vectors or scalars. We shall not use distinct notation to differentiate
between deterministic versus stochastic or vector versus scalar quantities; thus, the
nature of an entity should be understood from its context.
Chapter 2


Nonlinear  Detection  for  Digital  Communication  Over 
Optical  Channels 

2.1  Introduction 

The  receiving  end  of  any  optical  communication  link  (fiber  or  free-space)  is  equipped 
with one of several types of photodetectors (photoemissive, photovoltaic, or
photoconductive) to convert the received optical power into an electrical signal. The
output  of  a  photodetector,  regardless  of  its  type,  is  a  stream  of  narrow  pulses  which 
occur  with  a  rate  depending  on  the  instantaneous  optical  power  striking  the  surface 
of  the  device  [5].  Each  pulse  of  this  stream  corresponds  to  an  electron  generated 
through  a  photo-electron  conversion.  Avalanche  photodiodes  and  photomultiplier 
tubes are designed in such a manner that each photo-generated charge carrier
releases additional charge carriers [23]. This mechanism introduces an internal gain
modeled  by  i.i.d.  random  variables  which  multiply  the  amplitude  of  the  pulses  [5]. 
In  accordance  with  the  description  above,  in  the  most  general  case,  the  output  of  a 
photodetector  is  modeled  by  a  marked  and  filtered  Poisson  process  [24]  whose  rate 
is  modulated  by  the  incident  optical  power. 

Normally, the output of the photodetector is degraded by the thermal noise
generated by the internal photodetector resistance, the amplifier circuit, and the
load resistor [5]. The thermal noise is well modeled by an additive, zero-mean,
white Gaussian noise.

In  free-space  optical  channels,  atmospheric  turbulence  significantly  perturbs 
the  optical  power  at  the  receiving  end  of  the  link.  Mathematically,  this  phenomenon 
is  characterized  in  terms  of  a  (lognormal)  random  process  which  modulates  the 
optical  power  at  the  receiver  [25,  26].  In  this  case,  the  channel  output  must  be 
modeled  as  a  doubly  stochastic  marked  and  filtered  Poisson  process  [24]  in  additive 
white  Gaussian  noise. 

In  order  to  transmit  a  single  symbol  through  a  digital  optical  communication 
link,  a  waveform  associated  with  the  symbol  is  picked  from  a  set  of  predetermined 
waveforms  to  modulate  the  power  of  the  transmitting  optical  source  during  a  symbol 
transmission  time.  At  the  receiving  end,  based  on  the  channel  output  during  the 
symbol transmission time, a detector decides which symbol was transmitted, in such
a manner that the probability of an erroneous decision is minimized. The structure and
design of such a detector are the subject of this chapter.
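
For a finite symbol set, the decision that minimizes the probability of error is the maximum a posteriori (MAP) rule: choose the symbol with the largest product of prior probability and likelihood. The following minimal sketch assumes that the log-likelihood of the channel output under each hypothesis has already been computed (obtaining or approximating these quantities is the subject of this chapter) and that all priors are strictly positive. The function name and the numerical values are hypothetical.

```python
import math


def map_decide(log_likelihoods, priors):
    """Return the 1-based index m that maximizes log(p_m) + log-likelihood of m.

    Assumes every prior is strictly positive (math.log fails on zero).
    """
    scores = [math.log(p) + ll for ll, p in zip(log_likelihoods, priors)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + 1


# Hypothetical example with M = 3 symbols.
priors = [0.5, 0.3, 0.2]
log_likelihoods = [-4.0, -3.2, -3.9]
decision = map_decide(log_likelihoods, priors)
print("decided symbol:", decision)
```

Working with logarithms avoids numerical underflow when the likelihoods are very small, and it does not change the maximizing index.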

During  the  last  four  decades,  the  detection  problem  above  has  been  studied 
with  different  levels  of  model  complexity.  An  idealized  model  which  assumes  infinite 
bandwidth,  constant  avalanche  gain,  and  zero  thermal  noise  for  a  photodetector, 
is  provided  by  a  (doubly  stochastic)  Poisson  process.  Using  this  simplified  model, 
binary  and  M-ary  hypothesis  testing  problems  have  been  addressed  in  [6,  7,  8,  9,  10]. 
As  a  suboptimal  solution  to  this  hypothesis  testing  problem,  some  authors  have 
proposed a linear detector [11, 12, 13, 14, 15]. Gardner [11] introduced an equivalent
linear  model  for  a  marked  and  filtered  Poisson  process  and  employed  this  model  to 
develop  a  linear  detector.  In  [12,  13,  14,  15],  the  linear  detectors  are  designed  based 
on  minimizing  the  Chernoff  bound  for  the  probability  of  error. 

Hoversten  et  al.  [5],  Kurimoto  [16],  and  Bhanji  [17]  tackled  the  problem  by 
exploiting  a  general  formula  due  to  Duncan  [27]  and  Kailath  [28]  for  the  likelihood 
ratio function of a stochastic process in white Gaussian noise. They used a
state-space approach to develop an approximate estimator which contributes to the
Itô-integral-based estimator-correlator structure of the likelihood ratio function.

We approach the problem using the fact that, conditioned on the Poisson point
process associated with the photodetector output, the problem reduces to one of
M-ary detection of a deterministic signal in white Gaussian noise, which has a known
solution. Then, averaging over this point process, we obtain the solution to the
original problem. This procedure leads to an expression involving an infinite sum of
multiple integrals, which does not appear reducible to an explicit expression. The
method explained above has already been applied to a special case of the problem
by Bhanji [17] and Hero [18], albeit with limited results.
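
The conditioning step can be sketched as follows, in notation assumed here for illustration only (it is not necessarily the notation of [17, 18] or of the later sections). Suppose that, under hypothesis $m$, the pulse train $x_t$ at the photodetector output (that is, the arrival times and amplitudes of all pulses) were known on an observation interval $[0, T]$, and that the observation is $dy_t = x_t\,dt + \sigma\,dw_t$ with $\{w_t\}$ a standard Wiener process. The likelihood ratio against the noise-only reference $dy_t = \sigma\,dw_t$ then takes the classical form for a known signal in white Gaussian noise:

$$\Lambda(y \mid x) = \exp\left(\frac{1}{\sigma^2}\int_0^T x_t\,dy_t - \frac{1}{2\sigma^2}\int_0^T x_t^2\,dt\right).$$

Averaging over the unknown pulse train under hypothesis $m$, with the observed path $y$ held fixed, gives

$$\Lambda_m(y) = E_m\left[\Lambda(y \mid x)\right].$$

Expanding this expectation over the number of pulses $n = 0, 1, 2, \ldots$ and the corresponding arrival times produces one $n$-fold integral per term, which is the origin of the infinite sum of multiple integrals mentioned above.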

In order to derive useful results from the infinite sum mentioned above, we follow
two  different  directions.  In  the  first  direction,  we  establish  upper  and  lower  bounds 
on  the  infinite  sum  in  terms  of  two  integral  equations.  Then,  we  show  that  the  lower 
bound  approaches  the  upper  bound,  as  the  pulse  duration  tends  to  zero.  Based  on 
this  fact  and  for  a  small  pulse  duration,  we  approximate  the  infinite  sum  by  the 
solution  of  an  integral  equation. 

In the second direction, we introduce a new technique to rewrite the infinite
sum in terms of an expectation taken with respect to a stochastic process. This
formula, which is suitable for further approximations, will be the point of
departure for developing several approximate detectors, obtained under different sets of
assumptions.

2.2  The  Model  and  Problem  Statement 

We consider an optical channel in which a nonnegative input signal $s_t$ modulates
the power of an optical source at the transmitter. The optical signal strikes a
photodetector at the receiver after propagating through an optical medium, which,
in addition to attenuation, introduces random fluctuations in the optical power
(when the atmosphere is the propagation medium). For the purpose of amplification,
the photodetector is followed by an electronic circuit, which gives rise to corrupting
thermal noise. The output of this circuit is regarded as the channel output $y_t$.

The  model  we  use  for  an  optical  channel  (with  some  modification)  is  adopted 
from  [5,  29].  In  order  to  keep  the  description  self-contained,  we  reproduce  the  model 
here.  We  first  summarize  the  model  and  then  discuss  in  more  detail  the  physical 
significance  of  the  model  parameters. 

2.2.1  Stochastic  Model  of  an  Optical  Link 

Let $\{N_t,\ t \ge 0\}$ be a doubly stochastic Poisson process with jump times $\{t_n\}_{n=1}^{\infty}$
and a stochastic rate

$$\lambda_t = a s_t + \mu, \tag{2.1}$$

where $a$ is a nonnegative random variable with a known probability density function,
$\{s_t\}$ is a nonnegative stochastic process regarded as the channel input, and $\mu$ is a
known nonnegative constant. Let $\{q_n\}_{n=1}^{\infty}$ be an i.i.d. sequence of random variables
with a known cumulative distribution function. Denote by $\{w_t\}$ a standard Wiener
process and let $\sigma$ be a known constant. It is assumed that $\{t_n\}_{n=1}^{\infty}$, $\{q_n\}_{n=1}^{\infty}$,
and $\{w_t\}$ are statistically independent. Moreover, $\{s_t\}$ is statistically independent
of $\{q_n\}_{n=1}^{\infty}$, $a$, and $\{w_t\}$. Suppose that $\pi(\cdot)$ is a unit-area¹ deterministic function
such that $\pi(t) = 0$ for $t < 0$. Then, the channel output $y_t$ can be modeled [5] by
the stochastic differential equation²

$$dy_t = \sum_{n=1}^{N_t} q_n \, \pi(t - t_n) \, dt + \sigma \, dw_t. \tag{2.2}$$

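We note that a doubly stochastic Poisson point process is fully characterized [24] by

$$\Pr\{N_t = 0 \mid \lambda_\tau,\ \tau \in [0, t]\} = \exp\left(-\int_0^t \lambda_\tau \, d\tau\right) \tag{2.3}$$

and, for $n = 1, 2, 3, \ldots$,

$$\Pr\{N_t = n,\ t_i \in [\tau_i, \tau_i + d\tau_i);\ i = 1, 2, \ldots, n \mid \lambda_\tau,\ \tau \in [0, t]\} = \left(\prod_{i=1}^{n} \lambda_{\tau_i} \, d\tau_i\right) \exp\left(-\int_0^t \lambda_\tau \, d\tau\right) \tag{2.4}$$

when $0 \le \tau_1 \le \tau_2 \le \cdots \le \tau_n \le t$, and 0 otherwise.

¹ This means that $\int_0^\infty \pi(t) \, dt = 1$.
² We take $\sum_{n=1}^{0} (\cdot)$ equal to 0.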
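
The model (2.1)-(2.2) can be explored numerically. The sketch below, for a fixed value of the fade $a$, draws the arrival times by thinning a constant-rate Poisson process and then forms Euler increments of $y_t$ on a uniform grid. It is an illustration under stated assumptions, not a procedure from this work: the exponential pulse shape, the uniform integer mark distribution, the on-off waveform, and all numerical values are hypothetical, and `lam_max` must be supplied by the caller as an upper bound on the rate.

```python
import math
import random


def simulate_channel(s, a, mu, sigma, T, dt, pulse, draw_mark, lam_max, rng):
    """Simulate arrivals and Euler increments of (2.2) on [0, T] for a fixed fade a.

    s, pulse, and draw_mark are caller-supplied (hypothetical) callables;
    lam_max must upper-bound the rate a * s(t) + mu on [0, T].
    """
    # Thinning: propose arrivals at constant rate lam_max and keep each one
    # with probability lambda_t / lam_max, where lambda_t = a * s(t) + mu as in (2.1).
    arrivals = []
    t = rng.expovariate(lam_max)
    while t <= T:
        if rng.random() * lam_max < a * s(t) + mu:
            arrivals.append((t, draw_mark(rng)))
        t += rng.expovariate(lam_max)

    # Euler step: dy ~ sum_n q_n * pi(t - t_n) * dt + sigma * sqrt(dt) * N(0, 1).
    increments = []
    for k in range(int(round(T / dt))):
        tk = k * dt
        x = sum(q * pulse(tk - tn) for tn, q in arrivals)
        increments.append(x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return arrivals, increments


rng = random.Random(0)
tau = 0.01  # hypothetical pulse time constant (s)
pulse = lambda u: math.exp(-u / tau) / tau if u >= 0.0 else 0.0  # unit area, zero for u < 0
s = lambda t: 50.0 if t < 0.5 else 0.0  # hypothetical on-off waveform
arrivals, increments = simulate_channel(
    s, a=1.0, mu=5.0, sigma=0.1, T=1.0, dt=1e-3,
    pulse=pulse, draw_mark=lambda r: r.randint(1, 5), lam_max=55.0, rng=rng)
print(len(arrivals), "pulses;", len(increments), "increments")
```

The grid step should be small compared with the pulse duration; otherwise the left-endpoint evaluation of the pulse train in the Euler step becomes a poor approximation of the integral of $\pi$.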
2.2.2 Significance of the Model Parameters


The nonnegative constant $\mu$ in (2.1) represents the sum of the photodetector dark
current rate and the rate associated with the background radiation in the case
of free-space channels [29]. Except for the atmospheric channel, the parameter $\alpha$
in (2.1) is a known constant which characterizes the multiplicative combination
of the antenna gain, the path attenuation, and the photodetector sensitivity. For
atmospheric channels, $\alpha$ is modeled by a nonnegative random variable to reflect
the random fluctuations of the optical power caused by atmospheric turbulence.
When the receiving aperture is smaller than the turbulence coherence length$^1$, $\alpha$ is
a lognormal random variable [25] defined as $\alpha = \bar{\alpha} e^{2\chi}$, where $\bar{\alpha} = E[\alpha]$ and $\chi$ is a
normal random variable with mean $-\sigma_\chi^2$ and variance $\sigma_\chi^2$ [26]. Here, $\sigma_\chi^2$ is a known
constant depending on the wavelength of the light, the propagation distance, the
refractive-index structure constant, and the shape of the optical field [30]. Note that
the turbulent fade is a time-varying phenomenon; however, since its coherence time
is much longer than the transmission interval [25], it can be accurately modeled as a
random variable during the transmission of a single message. We remark that when
the receiver has perfect knowledge of the channel fade, $\alpha$ can be modeled
as a constant. Also, when imperfect information about the channel fade is provided to
the receiver as an estimate of $\alpha$, the distribution of $\alpha$ must be modified accordingly.
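The choice of a negative mean equal in magnitude to the variance is what makes the lognormal fade have mean $E[\alpha]$: for a normal $\chi$ with mean $m$ and variance $v$, $E[e^{2\chi}] = e^{2m + 2v}$, which equals 1 when $m = -v$. The sketch below checks this by Monte Carlo simulation. The names `sample_fade`, `empirical_mean_fade`, `alpha_bar`, and `sigma_chi_sq` are hypothetical labels chosen for this illustration.

```python
import math
import random


def sample_fade(alpha_bar, sigma_chi_sq, rng):
    """Draw one fade alpha = alpha_bar * exp(2 * chi).

    chi is normal with mean -sigma_chi_sq and variance sigma_chi_sq,
    following the lognormal model described in the text.
    """
    chi = rng.gauss(-sigma_chi_sq, math.sqrt(sigma_chi_sq))
    return alpha_bar * math.exp(2.0 * chi)


def empirical_mean_fade(alpha_bar, sigma_chi_sq, n, seed=0):
    """Average n independent fade samples (seeded for reproducibility)."""
    rng = random.Random(seed)
    return sum(sample_fade(alpha_bar, sigma_chi_sq, rng) for _ in range(n)) / n
```

With a moderate variance parameter, the empirical mean settles close to `alpha_bar` as the number of samples grows.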

The integer-valued i.i.d. random variables $\{q_n\}_{n=1}^{\infty}$ stand for the random
avalanche gains, i.e., the number of released charge carriers due to a single
photogenerated charge carrier. The probability distribution of $q_n$ has been derived
theoretically by McIntyre [31] and verified experimentally by Conradi [32].

1For the definition of the turbulence coherence length, see [25, 30].

The area under $\pi(\cdot)$ is equal to the charge of an electron multiplied by the
gain of the amplifier which follows the photodetector. For the sake of simplicity, we
normalize this quantity to 1. The shape of the pulses at the output of a practical
photodetector varies from one pulse to another, i.e., the pulse shape is a random
function [33]; however, since a reasonable stochastic model for the pulse shape is
not available, $\pi(\cdot)$ is characterized by the average of the random pulses. For an
avalanche photodiode, this averaged pulse shape is derived in [33, 34, 35].

The standard Wiener process $\{w_t\}$ in (2.2) represents the thermal noise generated
by the amplifier which follows the photodetector. The strength of the thermal
noise is characterized by the known constant $\sigma^2$.

2.2.3  Problem  Statement 

Suppose that a random message $m$ taken from $\{1, 2, \ldots, M\}$ is to be transmitted
through an optical channel during $[0, T]$. Here, the transmission time $T > 0$ is a
known constant. Denote by $p_m$, $m = 1, 2, \ldots, M$, the prior probability of message $m$.
Assume that a deterministic, nonnegative, bounded waveform $s_m(t)$ is assigned to
each message $m = 1, 2, \ldots, M$. Then, in order to transmit $m$, we let $s_t = s_m(t)$
for $t \in [0, T]$.

Let $(\Omega, \mathcal{F}, P)$ be the underlying probability space for the stochastic model
of Section 2.2.1. On this probability space, we define $\mathcal{Y}_T$ as the $\sigma$-algebra
generated by $\{y_t\}$ during $[0, T]$. The goal is to obtain a $\mathcal{Y}_T$-measurable detection
rule $\hat{m}(\mathcal{Y}_T) \in \{1, 2, \ldots, M\}$ which minimizes the probability of error defined as
$P_e = \Pr\{\hat{m}(\mathcal{Y}_T) \ne m\}$.

2.3  The  Optimal  Detection  Law 

In this section, we consider the hypothesis testing problem of Section 2.2.3 and
determine a formal solution for it via Lemma 2.3.1 below. Then, through
Theorems 2.3.1, 2.3.2, and 2.3.3, three different expressions will be presented for this
solution.

Lemma 2.3.1. For $i = 1, 2, \ldots, M$, consider the binary hypothesis testing problem

\[
\begin{aligned}
H_i &:\ dy_t = \sum_{n=1}^{N_t} q_n \pi(t - t_n)\, dt + \sigma\, dw_t, \quad \lambda_t = \alpha s_i(t) + \mu, \quad t \in [0, T] \\
H_0 &:\ dy_t = \sigma\, dw_t, \quad t \in [0, T]
\end{aligned} \tag{2.5}
\]

and let $L_i(T)$ denote its associated likelihood ratio function given $\mathcal{Y}_T$. Then, the
optimal detection rule which minimizes the probability of error $P_e = \Pr\{\hat{m}(\mathcal{Y}_T) \ne m\}$
is given by the maximum a posteriori estimator

\[
\hat{m}(\mathcal{Y}_T) = \arg\max_{1 \le i \le M} p_i L_i(T). \tag{2.6}
\]

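Proof. Define $\delta(i, j)$ such that $\delta(i, i) = 1$ and $\delta(i, j) = 0$ for $i \ne j$. Then, the
probability of error $P_e$ can be expressed as

\[
\begin{aligned}
P_e &= E\big[1 - \delta(m, \hat{m}(\mathcal{Y}_T))\big] \\
&= 1 - E\big[E[\delta(m, \hat{m}(\mathcal{Y}_T)) \mid \mathcal{Y}_T]\big] \\
&= 1 - E\Big[\sum_{i=1}^{M} \delta(i, \hat{m}(\mathcal{Y}_T)) \Pr\{m = i \mid \mathcal{Y}_T\}\Big].
\end{aligned}
\]

By the chain rule for likelihood ratios [36], we can write

\[
\frac{\Pr\{m = i \mid \mathcal{Y}_T\} / p_i}{\Pr\{m = j \mid \mathcal{Y}_T\} / p_j} = \frac{L_i(T)}{L_j(T)}
\]

which leads to

\[
P_e = 1 - E\left[\frac{\sum_{i=1}^{M} \delta(i, \hat{m}(\mathcal{Y}_T))\, p_i L_i(T)}{\sum_{i=1}^{M} p_i L_i(T)}\right].
\]

In order to minimize $P_e$, the sum in the numerator of the expression above must be
maximized, which results in the detection rule (2.6). □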


Corollary 2.3.1. The solution for the binary case ($M = 2$) of the hypothesis testing
problem in Section 2.2.3 is the threshold test

\[
\hat{m}(\mathcal{Y}_T) =
\begin{cases}
2 & \text{if } L_b(T) \ge p_1 / p_2 \\
1 & \text{if } L_b(T) < p_1 / p_2
\end{cases} \tag{2.7}
\]

where the likelihood ratio function $L_b(T)$ is given by

\[
L_b(T) = \frac{L_2(T)}{L_1(T)}.
\]

Proof. The proof follows directly from (2.6). □
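The detection rule (2.6) and its binary special case can be sketched in a few lines once the likelihood ratios are available. The sketch below assumes the values $L_i(T)$ have already been computed by some other means; the function names `map_detect` and `binary_threshold_detect` are hypothetical and not part of the original text.

```python
def map_detect(priors, likelihood_ratios):
    """Return the 1-based message index maximizing p_i * L_i(T), as in (2.6).

    Ties are broken in favor of the smallest index; the tie-breaking
    choice does not change the resulting probability of error.
    """
    if not priors or len(priors) != len(likelihood_ratios):
        raise ValueError("priors and likelihood_ratios must be non-empty and of equal length")
    scores = [p * lr for p, lr in zip(priors, likelihood_ratios)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + 1


def binary_threshold_detect(p1, p2, l1, l2):
    """Threshold test for M = 2: decide 2 when L_2(T)/L_1(T) >= p_1/p_2."""
    return 2 if l2 / l1 >= p1 / p2 else 1
```

Away from ties, both functions return the same decision for $M = 2$, which is the content of the corollary.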


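Our goal for the remainder of this section is to obtain proper expressions
for $L_i(T)$, $i = 1, 2, \ldots, M$. Theorem 2.3.1 below represents $L_i(T)$ in terms of an
infinite sum of multiple integrals.

Theorem 2.3.1. Fix $\alpha$ and define $\lambda_i(t) = \alpha s_i(t) + \mu$. Then, the likelihood ratio
function $L_i(T)$ can be expressed as

\[
L_i(T) = \Lambda(\mathcal{Y}_T; \{\lambda_i(t)\}, T) \tag{2.8}
\]

where, for any deterministic function $\lambda_t$, the functional $\Lambda(\cdot)$ is given by

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big(-\int_0^T \lambda_t\, dt\Big) \cdot \Big(1 + \sum_{n=1}^{\infty} \frac{1}{n!} \int_0^T \!\!\cdots\! \int_0^T \int_{-\infty}^{+\infty} \!\!\cdots\! \int_{-\infty}^{+\infty}
\Lambda_n^c(\tau_1, \ldots, \tau_n, \theta_1, \ldots, \theta_n) \prod_{k=1}^{n} dP_q(\theta_k) \prod_{k=1}^{n} \lambda_{\tau_k}\, d\tau_k\Big). \tag{2.9}
\]

Here, $\Lambda_n^c(\tau_1, \ldots, \tau_n, \theta_1, \ldots, \theta_n)$ is defined as

\[
\Lambda_n^c(\tau_1, \ldots, \tau_n, \theta_1, \ldots, \theta_n) = \exp\Big\{\frac{1}{\sigma^2} \int_0^T \sum_{k=1}^{n} \theta_k \pi(t - \tau_k)\, dy_t
- \frac{1}{2\sigma^2} \int_0^T \Big(\sum_{k=1}^{n} \theta_k \pi(t - \tau_k)\Big)^2 dt\Big\} \tag{2.10}
\]

and $P_q(\cdot)$ is the common cumulative distribution function of $\{q_n\}_{n=1}^{\infty}$.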

Proof. Consider the random vector $V_T = (t_1, \ldots, t_{N_T}, q_1, \ldots, q_{N_T})$. Conditioned
on $V_T$, the hypothesis testing problem (2.5) is one of detecting a deterministic
signal in white Gaussian noise. For the realization $(\tau_1, \ldots, \tau_n, \theta_1, \ldots, \theta_n)$ of $V_T$,
the likelihood ratio function associated with this hypothesis testing problem is
given by (2.10). The likelihood ratio function for the original problem can be
obtained [36] by averaging (2.10) over all realizations of $V_T$, where the rate associated
with $V_T$ is $\lambda_i(t) = \alpha s_i(t) + \mu$. Thus, we can write

\[
L_i(T) = E\big[\Lambda_{N_T}^c(t_1, \ldots, t_{N_T}, q_1, \ldots, q_{N_T}) \mid \mathcal{Y}_T\big]. \tag{2.11}
\]

Using (2.3) and (2.4), it is easy to verify that this expectation can be written as (2.8)
with

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big(-\int_0^T \lambda_t\, dt\Big) \cdot \Big(1 + \sum_{n=1}^{\infty} \int_0^T \!\int_0^{\tau_n} \!\!\cdots\! \int_0^{\tau_2} \int_{-\infty}^{+\infty} \!\!\cdots\! \int_{-\infty}^{+\infty}
\Lambda_n^c(\tau_1, \ldots, \tau_n, \theta_1, \ldots, \theta_n) \prod_{k=1}^{n} dP_q(\theta_k) \prod_{k=1}^{n} \lambda_{\tau_k}\, d\tau_k\Big). \tag{2.12}
\]

Since the integrand of each multiple integral is invariant under any change in the
order of the $\tau_k$'s, (2.12) can be rewritten as (2.9). □

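Corollary 2.3.2. For the case of a random $\alpha$, the likelihood ratio function $L_i(T)$
can be expressed as

\[
L_i(T) = \Lambda_\alpha(\mathcal{Y}_T; \{s_i(t)\}, T) \tag{2.13}
\]

where, for any deterministic function $s_t$, the functional $\Lambda_\alpha(\cdot)$ is given by

\[
\Lambda_\alpha(\mathcal{Y}_T; \{s_t\}, T) = \int_0^{\infty} \Lambda(\mathcal{Y}_T; \{\alpha s_t + \mu\}, T)\, dP_\alpha(\alpha). \tag{2.14}
\]

Here, $P_\alpha(\cdot)$ denotes the cumulative distribution function of $\alpha$.

Proof. Applying the law of total expectation to (2.11) and using the fact that $\alpha$
and $\mathcal{Y}_T$ are statistically independent, we get

\[
\begin{aligned}
L_i(T) &= E\big[E[\Lambda_{N_T}^c(t_1, \ldots, t_{N_T}, q_1, \ldots, q_{N_T}) \mid \mathcal{Y}_T, \alpha] \mid \mathcal{Y}_T\big] \\
&= E\big[\Lambda(\mathcal{Y}_T; \{\alpha s_i(t) + \mu\}, T) \mid \mathcal{Y}_T\big] \\
&= \int_0^{\infty} \Lambda(\mathcal{Y}_T; \{\alpha s_i(t) + \mu\}, T)\, dP_\alpha(\alpha).
\end{aligned}
\]

This leads to (2.13) with $\Lambda_\alpha(\cdot)$ defined by (2.14). □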

Remark 2.3.1. We observe from (2.9) and (2.10) that the dependence of the
functional $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$ on $\{y_t\}$ is through the stochastic process $\{z_t\}$ defined as

\[
z_t = \frac{1}{\sigma^2} \int_0^T \pi(\tau - t)\, dy_\tau, \quad t \in [0, T]. \tag{2.15}
\]

This implies that for implementing $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$, the first stage is a linear filter
characterized by (2.15). We remark that (2.15) is a matched filter for $\pi(\cdot)$.
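A discrete-time version of the matched filter (2.15) can be sketched as a correlation of the observation increments with the sampled pulse. The sketch below is only an illustration under stated assumptions: the observation is supplied as increments on a uniform grid, the pulse is supplied as samples on the same grid, and the names `matched_filter_output`, `dy`, and `pulse` are hypothetical.

```python
def matched_filter_output(dy, pulse, sigma):
    """Discretized z_t = (1 / sigma^2) * int_0^T pi(tau - t) dy_tau.

    dy[k]    : observation increment over the k-th grid cell
    pulse[m] : pulse sample pi(m * dt); the pulse is zero beyond len(pulse)
    sigma    : thermal noise strength (sigma^2 scales the output)
    """
    n = len(dy)
    z = []
    for i in range(n):
        acc = 0.0
        # Correlate the pulse, shifted to start at grid index i, with dy.
        for m, p in enumerate(pulse):
            if i + m >= n:
                break
            acc += p * dy[i + m]
        z.append(acc / sigma ** 2)
    return z
```

For a noise-free observation containing a single rectangular pulse, the output peaks at the grid index where the pulse starts, which is the behavior expected of a matched filter.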

The  following  theorem  introduces  a  new  technique  to  rewrite  (2.9)  in  a  simpler 
form,  which  is  suitable  for  the  purpose  of  approximation. 

Theorem 2.3.2. Let $\{\xi_t\}$ be a standard Wiener process independent of $\{t_n\}_{n=1}^{\infty}$,
$\{q_n\}_{n=1}^{\infty}$, and $\{w_t\}$, which is defined on the probability space $(\Omega, \mathcal{F}, P)$. Fix $\alpha$ and
define the stochastic process

\[
\zeta_t = \frac{1}{\sigma} \int_0^T \pi(\tau - t)\, d\xi_\tau, \quad t \in [0, T]. \tag{2.16}
\]

Then, with probability 1, the functional $\Lambda(\cdot)$ can be expressed as

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) = E\Big[\exp\Big(\int_0^T \lambda_t \Phi_q(z_t + j\zeta_t)\, dt\Big) \,\Big|\, \mathcal{Y}_T\Big] \tag{2.17}
\]

where $z_t$ is given by (2.15), $j = \sqrt{-1}$, and $\Phi_q(\cdot)$ is defined as

\[
\Phi_q(z) = \int_{-\infty}^{+\infty} \exp(\theta z)\, dP_q(\theta) - 1 \tag{2.18}
\]

when it exists.


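Proof. We can verify that

\[
\exp\Big\{-\frac{1}{2\sigma^2} \int_0^T \Big(\sum_{k=1}^{n} \theta_k \pi(t - \tau_k)\Big)^2 dt\Big\}
= E\Big[\exp\Big\{\frac{j}{\sigma} \int_0^T \sum_{k=1}^{n} \theta_k \pi(t - \tau_k)\, d\xi_t\Big\}\Big] \tag{2.19}
\]

by noting that the integral on the right side is a Gaussian random variable.
Substituting (2.15), (2.16), and (2.19) into (2.10), and noting that $\mathcal{Y}_T$ and $\{\zeta_t\}$ are
statistically independent, we get

\[
\begin{aligned}
\Lambda_n^c(\tau_1, \ldots, \tau_n, \theta_1, \ldots, \theta_n) &= \prod_{k=1}^{n} \exp(\theta_k z_{\tau_k})\, E\Big[\prod_{k=1}^{n} \exp(j \theta_k \zeta_{\tau_k})\Big] \\
&= E\Big[\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\} \,\Big|\, \mathcal{Y}_T\Big].
\end{aligned} \tag{2.20}
\]

Since with probability 1, the sample paths of $\{z_t\}$ are bounded, the rest of this
proof will be presented for bounded sample paths of $\{z_t\}$. Thus, any result which
is obtained based on the boundedness of $\{z_t\}$ will be stated with probability 1.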

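Let $\Re\{\cdot\}$ and $\Im\{\cdot\}$ denote the "real part" and the "imaginary part",
respectively. We can write

\[
\Big|\Re\Big\{\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\}\Big\}\Big|
\le \Big|\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\}\Big|
= \prod_{k=1}^{n} \exp(\theta_k z_{\tau_k}).
\]

Assuming that $\Phi_q(\cdot)$ exists, this leads to

\[
\begin{aligned}
\int_0^T \!\!\cdots\! \int_0^T \int_{-\infty}^{+\infty} \!\!\cdots\! \int_{-\infty}^{+\infty}
E\Big[\Big|\Re\Big\{\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\}\Big\}\Big| \,\Big|\, \mathcal{Y}_T\Big]
\prod_{k=1}^{n} dP_q(\theta_k) \prod_{k=1}^{n} \lambda_{\tau_k}\, d\tau_k
&\le \prod_{k=1}^{n} \int_0^T \Big(\int_{-\infty}^{+\infty} \exp(\theta_k z_{\tau_k})\, dP_q(\theta_k)\Big) \lambda_{\tau_k}\, d\tau_k \\
&= \Big(\int_0^T \lambda_t (\Phi_q(z_t) + 1)\, dt\Big)^n.
\end{aligned}
\]

For every bounded sample path of $\{z_t\}$, the right side of the inequality above is
bounded, thus we conclude from Fubini's theorem [37] that

\[
\begin{aligned}
&\int_0^T \!\!\cdots\! \int_0^T \int_{-\infty}^{+\infty} \!\!\cdots\! \int_{-\infty}^{+\infty}
E\Big[\Re\Big\{\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\}\Big\} \,\Big|\, \mathcal{Y}_T\Big]
\prod_{k=1}^{n} dP_q(\theta_k) \prod_{k=1}^{n} \lambda_{\tau_k}\, d\tau_k \\
&\quad = E\Big[\int_0^T \!\!\cdots\! \int_0^T \int_{-\infty}^{+\infty} \!\!\cdots\! \int_{-\infty}^{+\infty}
\Re\Big\{\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\}\Big\}
\prod_{k=1}^{n} dP_q(\theta_k) \prod_{k=1}^{n} \lambda_{\tau_k}\, d\tau_k \,\Big|\, \mathcal{Y}_T\Big].
\end{aligned}
\]

A similar argument can be applied to the imaginary part of

\[
\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\}
\]

in order to show that

\[
\begin{aligned}
&\int_0^T \!\!\cdots\! \int_0^T \int_{-\infty}^{+\infty} \!\!\cdots\! \int_{-\infty}^{+\infty}
E\Big[\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\} \,\Big|\, \mathcal{Y}_T\Big]
\prod_{k=1}^{n} dP_q(\theta_k) \prod_{k=1}^{n} \lambda_{\tau_k}\, d\tau_k \\
&\quad = E\Big[\int_0^T \!\!\cdots\! \int_0^T \int_{-\infty}^{+\infty} \!\!\cdots\! \int_{-\infty}^{+\infty}
\prod_{k=1}^{n} \exp\{\theta_k (z_{\tau_k} + j\zeta_{\tau_k})\}
\prod_{k=1}^{n} dP_q(\theta_k) \prod_{k=1}^{n} \lambda_{\tau_k}\, d\tau_k \,\Big|\, \mathcal{Y}_T\Big] \\
&\quad = E\Big[\Big(\int_0^T \lambda_t (\Phi_q(z_t + j\zeta_t) + 1)\, dt\Big)^n \,\Big|\, \mathcal{Y}_T\Big].
\end{aligned}
\]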
Substituting (2.20) into (2.9) and using the result above, we obtain

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big(-\int_0^T \lambda_t\, dt\Big) \sum_{n=0}^{\infty} \frac{1}{n!} E[Z^n \mid \mathcal{Y}_T]
\]

where the random variable $Z$ is defined as

\[
Z = \int_0^T \lambda_t (\Phi_q(z_t + j\zeta_t) + 1)\, dt.
\]

We note that $|Z|$ satisfies the inequality

\[
\begin{aligned}
|Z| &\le \int_0^T \lambda_t \big|\Phi_q(z_t + j\zeta_t) + 1\big|\, dt \\
&= \int_0^T \lambda_t \Big|\int_{-\infty}^{+\infty} \exp\{\theta (z_t + j\zeta_t)\}\, dP_q(\theta)\Big|\, dt \\
&\le \int_0^T \!\int_{-\infty}^{+\infty} \lambda_t \exp(\theta z_t)\, dP_q(\theta)\, dt \\
&= \int_0^T \lambda_t (\Phi_q(z_t) + 1)\, dt \triangleq |Z|_u.
\end{aligned}
\]

From this inequality and the fact that $|Z|_u < \infty$ for every bounded sample path
of $\{z_t\}$, we conclude that $|Z| < \infty$. Let $N$ be an integer and define the sequence of
complex-valued random variables $X_N$ as

\[
X_N = \sum_{n=0}^{N} \frac{Z^n}{n!}.
\]

From $|Z| < \infty$, we conclude that $X_N \to \exp(Z)$, and as a consequence, $\Re\{X_N\} \to
\Re\{\exp(Z)\}$ and $\Im\{X_N\} \to \Im\{\exp(Z)\}$. On the other hand, we have

\[
|X_N| \le \sum_{n=0}^{N} \frac{1}{n!} |Z|^n \le \sum_{n=0}^{N} \frac{1}{n!} |Z|_u^n \le \sum_{n=0}^{\infty} \frac{1}{n!} |Z|_u^n = \exp(|Z|_u).
\]

This leads to $|\Re\{X_N\}| \le |X_N| \le \exp(|Z|_u)$ and $|\Im\{X_N\}| \le |X_N| \le \exp(|Z|_u)$.
Also, for every bounded sample path of $\{z_t\}$, we have

\[
E[\exp(|Z|_u) \mid \mathcal{Y}_T] = \exp(|Z|_u) < \infty.
\]

Thus, we can apply the dominated convergence theorem [37] separately to $\Re\{X_N\}$
and $\Im\{X_N\}$ in order to show that

\[
\sum_{n=0}^{\infty} \frac{1}{n!} E[Z^n \mid \mathcal{Y}_T] = \lim_{N \to \infty} E[X_N \mid \mathcal{Y}_T] = E[\exp(Z) \mid \mathcal{Y}_T].
\]

This completes the proof. □


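Corollary 2.3.3. For the case of a random $\alpha$, under the assumptions of
Theorem 2.3.2 and assuming that $\{\xi_t\}$ is independent of $\alpha$, with probability 1, we have

\[
\Lambda_\alpha(\mathcal{Y}_T; \{s_t\}, T) = E\Big[\Phi_\alpha\Big(\int_0^T s_t \Phi_q(z_t + j\zeta_t)\, dt\Big)
\exp\Big(\mu \int_0^T \Phi_q(z_t + j\zeta_t)\, dt\Big) \,\Big|\, \mathcal{Y}_T\Big] \tag{2.21}
\]

where $\Phi_\alpha(\cdot)$ is defined as

\[
\Phi_\alpha(z) = \int_0^{\infty} \exp(\alpha z)\, dP_\alpha(\alpha) \tag{2.22}
\]

when it exists.

Proof. Let $\mathcal{Z}_T$ denote the $\sigma$-algebra generated by $\{\zeta_t\}$ during $[0, T]$. Using the law
of total expectation, we write

\[
\begin{aligned}
\Lambda_\alpha(\mathcal{Y}_T; \{s_t\}, T) &= E\Big[\exp\Big(\int_0^T (\alpha s_t + \mu) \Phi_q(z_t + j\zeta_t)\, dt\Big) \,\Big|\, \mathcal{Y}_T\Big] \\
&= E\Big[E\Big[\exp\Big(\int_0^T (\alpha s_t + \mu) \Phi_q(z_t + j\zeta_t)\, dt\Big) \,\Big|\, \mathcal{Y}_T, \mathcal{Z}_T\Big] \,\Big|\, \mathcal{Y}_T\Big] \\
&= E\Big[\int_0^{\infty} \exp\Big(\int_0^T (\alpha s_t + \mu) \Phi_q(z_t + j\zeta_t)\, dt\Big)\, dP_\alpha(\alpha) \,\Big|\, \mathcal{Y}_T\Big]
\end{aligned}
\]

where the last equality is concluded from the fact that $\alpha$ is independent of $\mathcal{Y}_T$
and $\mathcal{Z}_T$. Since the sample paths of $\{z_t\}$ and $\{\zeta_t\}$ are almost surely bounded,
assuming that $\Phi_\alpha(z)$ exists for every bounded $z$, this leads to (2.21) with
probability 1. □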

The  following  theorem  provides  an  alternative  expression  for  (2.9),  which  later 
will  be  used  to  establish  a  lower  bound  on  (2.9). 

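Theorem 2.3.3. Fix $\alpha$ and let $q_k = q$, $k = 1, 2, 3, \ldots$ be a constant. Then, with
probability 1, (2.9) can be expressed as

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big(\int_0^T \lambda_t \Phi_q(z_t)\, dt\Big)\,
E\Big[\exp\Big\{-\frac{q^2}{2\sigma^2} \int_0^T \Big(\sum_{n=1}^{N_T} \pi(t - t_n)\Big)^2 dt\Big\} \,\Big|\, \mathcal{Y}_T\Big] \tag{2.23}
\]

where $\Phi_q(\cdot)$ is defined as

\[
\Phi_q(z) = e^{qz} - 1 \tag{2.24}
\]

and $\{t_n\}_{n=1}^{\infty}$ is a doubly stochastic Poisson point process with the rate

\[
\lambda_t^* = \lambda_t e^{q z_t}. \tag{2.25}
\]

Proof. Noting that $q_k = q$ is a constant, we substitute (2.10), (2.24), and (2.25)
into (2.12) and rewrite it as

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big(\int_0^T \lambda_t \Phi_q(z_t)\, dt\Big) \cdot \exp\Big(-\int_0^T \lambda_t^*\, dt\Big)
\Big(1 + \sum_{n=1}^{\infty} \int_0^T \!\int_0^{\tau_n} \!\!\cdots\! \int_0^{\tau_2}
\exp\Big\{-\frac{q^2}{2\sigma^2} \int_0^T \Big(\sum_{k=1}^{n} \pi(t - \tau_k)\Big)^2 dt\Big\} \prod_{k=1}^{n} \lambda_{\tau_k}^*\, d\tau_k\Big).
\]

For bounded sample paths of $\{\lambda_t^*\}$, it is easy to verify that this expression is
equivalent to (2.23). Since with probability 1, the sample paths of $\{z_t\}$, and as a
consequence the sample paths of $\{\lambda_t^*\}$, are bounded, (2.23) holds with probability 1. □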

2.4 Upper and Lower Bounds on $\Lambda(\cdot)$

In this section, we determine simple upper and lower bounds for $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$.
Since the expressions obtained for $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$ in Section 2.3 are complicated,
these bounds are useful in studying the behavior of $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$ for some
important limiting cases. In addition, under conditions close to these limiting cases, the
bounds can be used to approximate $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$. Throughout this section, we
shall assume that $\lambda_t$ is a deterministic function.

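Theorem 2.4.1. Assume that $q_k \ge 0$, $k = 1, 2, 3, \ldots$ and $\pi(t) \ge 0$ for every $t \ge 0$.
Define the functional

\[
B_1^u(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big(\int_0^T \lambda_t \Phi_q^*(z_t, b_t)\, dt\Big) \tag{2.26}
\]

where

\[
b_t = \frac{1}{\sigma^2} \int_0^T \pi^2(\tau - t)\, d\tau \tag{2.27}
\]

and assuming that $\Phi_q^*(z, b)$ exists for every bounded $z$ and $b$, it is defined as

\[
\Phi_q^*(z, b) = \int_{-\infty}^{+\infty} \exp\Big(\theta z - \frac{\theta^2 b}{2}\Big)\, dP_q(\theta) - 1. \tag{2.28}
\]

Then, with probability 1, we have

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) \le B_1^u(\mathcal{Y}_T; \{\lambda_t\}, T). \tag{2.29}
\]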


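Proof. Under the assumptions of the theorem, the second term on the right side of

\[
\Big(\sum_{k=1}^{n} \theta_k \pi(t - \tau_k)\Big)^2 = \sum_{k=1}^{n} \theta_k^2 \pi^2(t - \tau_k)
+ \sum_{k=1}^{n} \sum_{\substack{l=1 \\ l \ne k}}^{n} \theta_k \theta_l\, \pi(t - \tau_k)\, \pi(t - \tau_l)
\]

is nonnegative. This indicates that the left side is not less than the first term on
the right side. Applying this result to (2.10), we have

\[
\Lambda_n^c(\tau_1, \ldots, \tau_n, \theta_1, \ldots, \theta_n) \le \prod_{k=1}^{n} \exp\Big(\theta_k z_{\tau_k} - \frac{\theta_k^2 b_{\tau_k}}{2}\Big).
\]

Substituting this inequality into (2.9), we get

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) \le \exp\Big(-\int_0^T \lambda_t\, dt\Big) \sum_{n=0}^{\infty} \frac{1}{n!} \Big(\int_0^T \lambda_t (\Phi_q^*(z_t, b_t) + 1)\, dt\Big)^n.
\]

For every bounded sample path of $\{z_t\}$, the right side of this inequality converges
to (2.26). Since with probability 1, the sample paths of $\{z_t\}$ are bounded, this
indicates that (2.29) holds with probability 1. □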
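To make the upper bound (2.26) concrete, the following minimal sketch evaluates its logarithm by a Riemann sum for a discrete gain distribution. It assumes the paths of $z_t$ and $b_t$ and the rate are already sampled on a uniform grid; the names `phi_q`, `phi_q_star`, `log_upper_bound`, and the dictionary format of `gain_pmf` are hypothetical choices for this illustration. Working with the logarithm avoids overflow for large exponents.

```python
import math


def phi_q(z, gain_pmf):
    """Phi_q(z) = sum_theta P(theta) * exp(theta * z) - 1 for a discrete gain pmf."""
    return sum(p * math.exp(theta * z) for theta, p in gain_pmf.items()) - 1.0


def phi_q_star(z, b, gain_pmf):
    """Phi_q^*(z, b) = sum_theta P(theta) * exp(theta*z - theta^2*b/2) - 1, as in (2.28)."""
    return sum(p * math.exp(theta * z - 0.5 * theta ** 2 * b)
               for theta, p in gain_pmf.items()) - 1.0


def log_upper_bound(rate, z, b, gain_pmf, dt):
    """Riemann-sum approximation of log B_1^u = int_0^T rate_t * Phi_q^*(z_t, b_t) dt."""
    return sum(lam * phi_q_star(zk, bk, gain_pmf)
               for lam, zk, bk in zip(rate, z, b)) * dt
```

Because $\exp(-\theta^2 b / 2) \le 1$ for $b \ge 0$, the value returned by `log_upper_bound` never exceeds the corresponding sum built from `phi_q`, which is the more conservative bound mentioned in the remark that follows.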

Remark 2.4.1. The more conservative upper bound

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) \le B_1^u(\mathcal{Y}_T; \{\lambda_t\}, T) \le \exp\Big(\int_0^T \lambda_t \Phi_q(z_t)\, dt\Big) \tag{2.30}
\]


does  not  require  the  assumptions  of  Theorem  2.4.1.  The  proof  follows  from 


<  1. 


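Theorem 2.3.3 and Lemma 2.4.1 stated below can be used to establish a lower
bound on $\Lambda(\cdot)$. This lower bound will be presented in Theorem 2.4.2.

Lemma 2.4.1. Let $\{t_n\}_{n=1}^{\infty}$ be a Poisson point process with rate $\lambda_t$. Then, we have

\[
E\Big[\int_0^T \Big(\sum_{n=1}^{N_T} \pi(t - t_n)\Big)^2 dt\Big]
= \int_0^T \!\int_0^T \lambda_\tau \pi^2(t - \tau)\, d\tau\, dt + \int_0^T \Big(\int_0^T \lambda_\tau \pi(t - \tau)\, d\tau\Big)^2 dt. \tag{2.31}
\]

Theorem 2.4.2. Under the assumptions of Theorem 2.3.3, with probability 1, the functional

\[
B_1^\ell(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big\{\int_0^T \lambda_t \Phi_q(z_t)\, dt
- \frac{q}{2\sigma^2} \int_0^T \!\int_0^T \lambda_\tau \Phi_q'(z_\tau)\, \pi^2(t - \tau)\, d\tau\, dt
- \frac{1}{2\sigma^2} \int_0^T \Big(\int_0^T \lambda_\tau \Phi_q'(z_\tau)\, \pi(t - \tau)\, d\tau\Big)^2 dt\Big\} \tag{2.32}
\]

is a lower bound for $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$. Here, $\Phi_q'(\cdot)$ is the derivative of $\Phi_q(\cdot)$, which is
defined by (2.24).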

Proof. Noting that the exponential is a convex function, we apply Jensen's inequality
to (2.23) to get

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) \ge \exp\Big(\int_0^T \lambda_t \Phi_q(z_t)\, dt\Big)
\exp\Big\{-\frac{q^2}{2\sigma^2} E\Big[\int_0^T \Big(\sum_{n=1}^{N_T} \pi(t - t_n)\Big)^2 dt \,\Big|\, \mathcal{Y}_T\Big]\Big\}.
\]

Next, we apply Lemma 2.4.1 to the right side of this inequality to obtain (2.32). □


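Corollary 2.4.1. Under the assumptions of Theorem 2.3.3, with probability 1, we
have

\[
\lim_{\sigma \to \infty} \frac{\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)}{B_1^u(\mathcal{Y}_T; \{\lambda_t\}, T)}
= \lim_{\sigma \to \infty} \frac{\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)}{B_1^\ell(\mathcal{Y}_T; \{\lambda_t\}, T)} = 1. \tag{2.33}
\]

Proof. From (2.26), (2.30), and (2.32), with probability 1, we have

\[
\exp\Big\{-\frac{q}{2\sigma^2} \int_0^T \!\int_0^T \lambda_\tau \Phi_q'(z_\tau)\, \pi^2(t - \tau)\, d\tau\, dt
- \frac{1}{2\sigma^2} \int_0^T \Big(\int_0^T \lambda_\tau \Phi_q'(z_\tau)\, \pi(t - \tau)\, d\tau\Big)^2 dt\Big\}
\le \frac{B_1^\ell(\mathcal{Y}_T; \{\lambda_t\}, T)}{B_1^u(\mathcal{Y}_T; \{\lambda_t\}, T)}
\le \frac{\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)}{B_1^u(\mathcal{Y}_T; \{\lambda_t\}, T)} \le 1.
\]

As $\sigma \to \infty$, for every bounded sample path of $\{z_t\}$, the expression on the left side
of this inequality tends to 1. This proves that (2.33) holds with probability 1. □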

Remark 2.4.2. Using Theorems 2.4.1 and 2.4.2 and (2.14), we can find upper and
lower bounds on $\Lambda_\alpha(\mathcal{Y}_T; \{s_t\}, T)$ as

\[
\Lambda_\alpha(\mathcal{Y}_T; \{s_t\}, T) \ge \int_0^{\infty} B_1^\ell(\mathcal{Y}_T; \{\alpha s_t + \mu\}, T)\, dP_\alpha(\alpha)
\]

\[
\Lambda_\alpha(\mathcal{Y}_T; \{s_t\}, T) \le \int_0^{\infty} B_1^u(\mathcal{Y}_T; \{\alpha s_t + \mu\}, T)\, dP_\alpha(\alpha).
\]

In the remainder of this section, we adopt a different approach to establish
upper and lower bounds on $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$. Theorems 2.4.3 and 2.4.4 below explain
this approach.

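Theorem 2.4.3. Assume that $q_k = q$, $k = 1, 2, 3, \ldots$ is a constant and $\pi(t) \ge 0$ for
every $t \ge 0$. Let $X_t^u$ be the solution of the integral equation

\[
X_t^u = 1 + \int_0^t \gamma^u(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, X_\tau^u\, d\tau \tag{2.34}
\]

where $f(z) = e^{qz}$, $c_t = \exp(-q^2 b_t / 2)$, and

\[
\gamma^u(\tau_1, \tau_2) = \exp\Big(-\frac{q^2}{\sigma^2} \int_0^T \pi(t - \tau_1)\, \pi(t - \tau_2)\, dt\Big). \tag{2.35}
\]

Then, with probability 1, we have

\[
\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) \le B_2^u(\mathcal{Y}_T; \{\lambda_t\}, T) \le B_1^u(\mathcal{Y}_T; \{\lambda_t\}, T) \tag{2.36}
\]

where $B_2^u(\cdot)$ is defined as

\[
B_2^u(\mathcal{Y}_T; \{\lambda_t\}, T) = X_T^u \exp\Big(-\int_0^T \lambda_t\, dt\Big). \tag{2.37}
\]

Proof. We first define

\[
B_2^u(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big(-\int_0^T \lambda_t\, dt\Big) \cdot \Big(1 + \int_0^T \lambda_{\tau_1} c_{\tau_1} f(z_{\tau_1})\, d\tau_1
+ \sum_{n=2}^{\infty} \int_0^T \!\int_0^{\tau_n} \!\!\cdots\! \int_0^{\tau_2} \lambda_{\tau_n} c_{\tau_n} f(z_{\tau_n})\, d\tau_n
\prod_{k=1}^{n-1} \gamma^u(\tau_{k+1}, \tau_k)\, \lambda_{\tau_k} c_{\tau_k} f(z_{\tau_k})\, d\tau_k\Big) \tag{2.38}
\]

and show that this expression can be written as

\[
B_2^u(\mathcal{Y}_T; \{\lambda_t\}, T) = Y_T^u \exp\Big(-\int_0^T \lambda_t\, dt\Big) \tag{2.39}
\]

where

\[
Y_T^u = 1 + \int_0^T \lambda_t c_t f(z_t)\, X_t^u\, dt.
\]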

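To achieve this goal, for every bounded sample path of $\{z_t\}$, every $t > 0$, and every
integer $N \ge 2$, we define

\[
Z_t^N = 1 + \sum_{n=2}^{N} \int_0^{\tau_n} \!\int_0^{\tau_{n-1}} \!\!\cdots\! \int_0^{\tau_2}
\prod_{k=1}^{n-1} \gamma^u(\tau_{k+1}, \tau_k)\, \lambda_{\tau_k} c_{\tau_k} f(z_{\tau_k})\, d\tau_k \,\Big|_{\tau_n = t}.
\]

We note that for every fixed $t > 0$, $Z_t^N$ is increasing in $N$ and satisfies

\[
Z_t^N \le \exp\Big(\int_0^t \lambda_\tau c_\tau f(z_\tau)\, d\tau\Big).
\]

This shows that for every bounded sample path of $\{z_t\}$, $Z_t^\infty = \lim_{N \to \infty} Z_t^N$ exists.
For every $N \ge 3$, we can write

\[
Z_t^N = 1 + \int_0^t \gamma^u(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, Z_\tau^{N-1}\, d\tau
\]

which leads to

\[
\begin{aligned}
Z_t^\infty &= 1 + \lim_{N \to \infty} \int_0^t \gamma^u(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, Z_\tau^{N-1}\, d\tau \\
&= 1 + \int_0^t \gamma^u(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, Z_\tau^\infty\, d\tau.
\end{aligned}
\]

Here, the second equality is concluded from the monotone convergence theorem [38].
This result indicates that $X_t^u = Z_t^\infty$ for $t > 0$. Using the monotone convergence
theorem, we express (2.38) as

\[
\begin{aligned}
B_2^u(\mathcal{Y}_T; \{\lambda_t\}, T) &= \lim_{N \to \infty} \exp\Big(-\int_0^T \lambda_t\, dt\Big) \Big(1 + \int_0^T \lambda_t c_t f(z_t)\, Z_t^N\, dt\Big) \\
&= \exp\Big(-\int_0^T \lambda_t\, dt\Big) \Big(1 + \int_0^T \lambda_t c_t f(z_t)\, Z_t^\infty\, dt\Big)
\end{aligned}
\]

which proves that (2.39) holds. Expressing $Y_T^u$ as

\[
Y_T^u = 1 + \int_0^T \gamma^u(T, t)\, \lambda_t c_t f(z_t)\, X_t^u\, dt + \int_0^T (1 - \gamma^u(T, t))\, \lambda_t c_t f(z_t)\, X_t^u\, dt
\]

and noting that $\gamma^u(T, t) = 1$ for every $0 \le t \le T$, we find that $Y_T^u = X_T^u$. This
verifies that with probability 1, (2.38) is equal to (2.37).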

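To prove the first inequality (from left) of (2.36), we use (2.38) and (2.12) to
construct

\[
B_2^u(\mathcal{Y}_T; \{\lambda_t\}, T) - \Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big(-\int_0^T \lambda_t\, dt\Big)
\sum_{n=2}^{\infty} \int_0^T \!\int_0^{\tau_n} \!\!\cdots\! \int_0^{\tau_2} \chi_n(\tau_1, \tau_2, \ldots, \tau_n) \prod_{k=1}^{n} \lambda_{\tau_k} f(z_{\tau_k})\, d\tau_k \tag{2.40}
\]

where $\chi_n(\cdot)$ is given by

\[
\chi_n(\tau_1, \tau_2, \ldots, \tau_n) = \prod_{k=1}^{n} c_{\tau_k} \prod_{k=1}^{n-1} \gamma^u(\tau_{k+1}, \tau_k)
- \exp\Big\{-\frac{q^2}{2\sigma^2} \int_0^T \Big(\sum_{k=1}^{n} \pi(t - \tau_k)\Big)^2 dt\Big\}.
\]

It is easy to show that $\chi_n(\tau_1, \tau_2, \ldots, \tau_n) \ge 0$, by rearranging the expression above as

\[
\chi_n(\tau_1, \tau_2, \ldots, \tau_n) = \prod_{k=1}^{n} c_{\tau_k} \prod_{k=1}^{n-1} \gamma^u(\tau_{k+1}, \tau_k)
\Big(1 - \exp\Big\{-\frac{q^2}{2\sigma^2} \int_0^T \sum_{\mathcal{S}(i,j)} \pi(t - \tau_i)\, \pi(t - \tau_j)\, dt\Big\}\Big)
\]

where $\mathcal{S}(i, j) = \{(i, j) : |i - j| > 1\}$. This verifies that (2.40) is nonnegative and
completes the proof. The second inequality of (2.36) follows from $\gamma^u(\cdot, \cdot) \le 1$. □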

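Theorem 2.4.4. Assume that $q_k = q$, $k = 1, 2, 3, \ldots$ is a constant, $\pi(t) \ge 0$ for
every $t \ge 0$, and $\pi(t) = 0$ for $t > \epsilon$, where $\epsilon > 0$ is a known constant. Let $X_t^\ell$ and $Y_t^\ell$
be the solutions of

\[
X_t^\ell = 1 + \int_0^t \gamma^\ell(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, X_\tau^\ell\, d\tau \tag{2.41}
\]

\[
Y_t^\ell = 1 + \int_0^t \lambda_\tau c_\tau f(z_\tau)\, X_\tau^\ell\, d\tau \tag{2.42}
\]

where $\gamma^\ell(\cdot, \cdot)$ is defined such that

\[
\gamma^\ell(\tau_1, \tau_2) = \gamma^\ell(\tau_2, \tau_1) =
\begin{cases}
1 & 0 \le \tau_1 \le \max(0, \tau_2 - \epsilon) \\
0 & \max(0, \tau_2 - \epsilon) < \tau_1 \le \tau_2.
\end{cases}
\]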

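Then, with probability 1, we have

\[
B_2^\ell(\mathcal{Y}_T; \{\lambda_t\}, T) \le B_3^\ell(\mathcal{Y}_T; \{\lambda_t\}, T) \le \Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) \tag{2.43}
\]

where

\[
B_2^\ell(\mathcal{Y}_T; \{\lambda_t\}, T) = X_T^\ell \exp\Big(-\int_0^T \lambda_t\, dt\Big) \tag{2.44}
\]

\[
B_3^\ell(\mathcal{Y}_T; \{\lambda_t\}, T) = Y_T^\ell \exp\Big(-\int_0^T \lambda_t\, dt\Big). \tag{2.45}
\]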

Proof. The proof is similar to the proof of Theorem 2.4.3, by replacing superscript $u$
with $\ell$. For this case, $\chi_n(\cdot)$ is given by

\[
\chi_n(\tau_1, \tau_2, \ldots, \tau_n) = \prod_{k=1}^{n} c_{\tau_k} \Big(\prod_{j=2}^{n} \gamma^\ell(\tau_j, \tau_{j-1}) - \prod_{j=2}^{n} \prod_{i=1}^{j-1} \gamma^u(\tau_j, \tau_i)\Big).
\]

From the definitions of $\gamma^u(\cdot, \cdot)$ and $\gamma^\ell(\cdot, \cdot)$ and the assumptions of the theorem, it
is easy to show that for $0 \le \tau_1 \le \tau_2 \le \cdots \le \tau_j$, we have

\[
\gamma^\ell(\tau_j, \tau_{j-1}) \le \prod_{i=1}^{j-1} \gamma^u(\tau_j, \tau_i)
\]

which leads to $\chi_n(\tau_1, \tau_2, \ldots, \tau_n) \le 0$, $n \ge 2$. This proves the inequality on the right
side of (2.43). The inequality on the left side follows from $X_T^\ell \le Y_T^\ell$. □

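The following theorem provides a closed form expression for a lower bound
on (2.45).

Theorem 2.4.5. Let the assumptions of Theorem 2.4.4 hold and define

\[
B_4^\ell(\mathcal{Y}_T; \{\lambda_t\}, T) = \exp\Big\{-\int_0^T \lambda_t\, dt
+ \int_0^T \frac{\lambda_t c_t f(z_t)}{1 + \int_0^t (1 - \gamma^\ell(t, \tau))\, \lambda_\tau c_\tau f(z_\tau)\, d\tau}\, dt\Big\}. \tag{2.46}
\]

Then, with probability 1, we have

\[
B_4^\ell(\mathcal{Y}_T; \{\lambda_t\}, T) \le B_3^\ell(\mathcal{Y}_T; \{\lambda_t\}, T). \tag{2.47}
\]

Proof. Using (2.41), we can write

\[
X_t^\ell = 1 + \int_0^t \lambda_\tau c_\tau f(z_\tau)\, X_\tau^\ell\, d\tau - \int_0^t (1 - \gamma^\ell(t, \tau))\, \lambda_\tau c_\tau f(z_\tau)\, X_\tau^\ell\, d\tau.
\]

From (2.42) and the fact that $X_t^\ell$ is increasing in $t$, we find that

\[
X_t^\ell \ge Y_t^\ell - X_t^\ell \int_0^t (1 - \gamma^\ell(t, \tau))\, \lambda_\tau c_\tau f(z_\tau)\, d\tau.
\]

We know from (2.42) that $X_t^\ell = \dot{Y}_t^\ell / (\lambda_t c_t f(z_t))$, thus substituting this result into the
inequality above and rearranging it, we get

\[
\frac{\dot{Y}_t^\ell}{Y_t^\ell} \ge \frac{\lambda_t c_t f(z_t)}{1 + \int_0^t (1 - \gamma^\ell(t, \tau))\, \lambda_\tau c_\tau f(z_\tau)\, d\tau}.
\]

Upon integrating both sides of this inequality over $[0, T]$, we obtain (2.47). □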

2.5 Behavior of $\Lambda(\cdot)$ for a Small Pulse Duration

In this section, we study the behavior of $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$ when the pulse duration $\epsilon$
tends to zero. Before addressing the main topic, we discuss a technical difficulty
arising from $\epsilon \to 0$, namely

\[
\lim_{\epsilon \to 0} E[\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)] = \infty. \tag{2.48}
\]

As mentioned in Section 2.2.1, $\pi(\cdot)$ has a unit area, thus it can be expressed
as $\pi(t) = \epsilon^{-1} \bar{\pi}(t / \epsilon)$, where $\bar{\pi}(\cdot)$ has a unit area and $|\bar{\pi}(\cdot)|$ is bounded for every
$t \ge 0$. In addition, the condition $\pi(t) = 0$, $t > \epsilon$ implies that $\bar{\pi}(t) = 0$, $t > 1$. We
define $\nu(\tau_1, \tau_2) = 1 / \gamma^u(\tau_1, \tau_2)$, where $\gamma^u(\cdot, \cdot)$ is given by (2.35). This function can
be expressed as

\[
\nu(\tau_1, \tau_2) = \exp\Big(\frac{q^2}{\sigma^2 \epsilon}\, \Pi_\epsilon(\tau_1, \tau_2)\Big) \tag{2.49}
\]

where $\Pi_\epsilon(\cdot, \cdot)$ is defined as

\[
\Pi_\epsilon(\tau_1, \tau_2) = \int_0^{T/\epsilon} \bar{\pi}(t - \tau_1 / \epsilon)\, \bar{\pi}(t - \tau_2 / \epsilon)\, dt.
\]

Let $\bar{\pi}(\cdot)$ be nonnegative over $[0, 1]$. Then, for some subset of $|\tau_1 - \tau_2| \le \epsilon$ with
nonzero Lebesgue measure, we have $\Pi_\epsilon(\tau_1, \tau_2) > 0$. Referring to (2.49), this
implies that $\lim_{\epsilon \to 0} \epsilon\, \nu(\tau_1, \tau_2) = \infty$ over a subset of $|\tau_1 - \tau_2| \le \epsilon$. Thus, for every
function $\lambda_t > 0$, $t \in [0, T]$ and every $t^* \in [0, T]$, we have

\[
\lim_{\epsilon \to 0} \int_0^T \lambda_t\, \nu(t, t^*)\, dt = \infty. \tag{2.50}
\]

We note that the stochastic process $\{c_t f(z_t)\}_{t=0}^{T}$ can be expressed as$^1$

\[
c_t f(z_t) = \rho_t \prod_{n=1}^{N_T} \nu(t, t_n) \tag{2.51}
\]

where $\{\rho_t\}$ is defined as

\[
\rho_t = \exp\Big\{\frac{q}{\sigma} \int_0^T \pi(\tau - t)\, dw_\tau - \frac{q^2}{2\sigma^2} \int_0^T \pi^2(\tau - t)\, d\tau\Big\}.
\]

It is easy to verify that for every fixed $t \ge 0$, the lognormal random variable $\rho_t$ has
a unit mean. When $\epsilon \to 0$, since the pulses in $\{z_t\}$ do not overlap each other, (2.51)
can be written as

\[
c_t f(z_t) = \rho_t + \rho_t \sum_{n=1}^{N_T} (\nu(t, t_n) - 1). \tag{2.52}
\]

We recall from Theorem 2.4.4 that $X_t^\ell \ge 1$, $t \in [0, T]$. Thus, using (2.42),
(2.52), and $E[\rho_t] = 1$, we can write

\[
\begin{aligned}
E[Y_T^\ell] &\ge E\Big[1 + \int_0^T \lambda_t c_t f(z_t)\, dt\Big] \\
&= 1 + \int_0^T \lambda_t\, dt + \int_0^T \lambda_t E\Big[\sum_{n=1}^{N_T} (\nu(t, t_n) - 1)\Big]\, dt.
\end{aligned}
\]

Applying (2.50) to this inequality, we get $\lim_{\epsilon \to 0} E[Y_T^\ell] = \infty$, which together
with (2.43) and (2.45) proves (2.48).

In spite of the difficulty mentioned above, we present useful results below for
the case of $\epsilon \to 0$. These results provide appropriate means for approximating $\Lambda(\cdot)$
when $\epsilon$ is small.

1We take $\prod_{n=1}^{0} (\cdot)$ equal to 1.


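Theorem 2.5.1. Assume that the unit area function $\pi(\cdot)$ has the property that
$\pi(t) \ge 0$ for $t \in [0, \epsilon]$ and $\pi(t) = 0$ beyond this interval. Then, with probability 1,
we have

\[
\lim_{\epsilon \to 0} \frac{\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)}{B_2^\ell(\mathcal{Y}_T; \{\lambda_t\}, T)}
= \lim_{\epsilon \to 0} \frac{\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)}{B_2^u(\mathcal{Y}_T; \{\lambda_t\}, T)} = 1.
\]

Proof. From (2.34), (2.41), and the fact that $\gamma^\ell(\tau_1, \tau_2) = \gamma^\ell(\tau_1, \tau_2)\, \gamma^u(\tau_1, \tau_2)$, we
can write

\[
\beta_t \triangleq \frac{X_t^u - X_t^\ell}{X_t^u}
= \frac{1}{X_t^u} \Big[\int_0^t \gamma^\ell(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, (X_\tau^u - X_\tau^\ell)\, d\tau
+ \int_0^t (1 - \gamma^\ell(t, \tau))\, \gamma^u(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, X_\tau^u\, d\tau\Big].
\]

The increasing property of $X_t^u$ and the facts that $X_t^\ell \le X_t^u$ and $\gamma^\ell(\cdot, \cdot) \le 1$ result in

\[
\beta_t \le \int_0^t \lambda_\tau c_\tau f(z_\tau)\, \beta_\tau\, d\tau + \delta_t \tag{2.53}
\]

where $\delta_t$ is defined as

\[
\delta_t = \int_0^t (1 - \gamma^\ell(t, \tau))\, \gamma^u(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, d\tau.
\]

Using (2.51) and $E[\rho_t] = 1$, we find the mean of $\delta_t$ as

\[
E[\delta_t] = E\Big[\int_0^t (1 - \gamma^\ell(t, \tau))\, \gamma^u(t, \tau)\, \lambda_\tau \prod_{n=1}^{N_T} \nu(\tau, t_n)\, d\tau\Big]. \tag{2.54}
\]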

For a fixed sample path of $\{t_n\}_{n=1}^{\infty}$, let $t'$ and $t''$ be, respectively, the closest and the
next closest $t_n$ to $t$. Assume that $|t - t'| > 0$. Then, for $\epsilon < |t - t'| / 2$, over the interval
$\tau \in [\max(0, t - \epsilon), t]$ in which $1 - \gamma^\ell(t, \tau) \ne 0$, we have $\prod_{n=1}^{N_T} \nu(\tau, t_n) = 1$. Now, we
assume that $t' = t$. Then, noting that $\Pr\{|t' - t''| > 0\} = 1$, for $\epsilon < |t' - t''| / 2$, we
have $\gamma^u(t, \tau) \prod_{n=1}^{N_T} \nu(\tau, t_n) = \gamma^u(t, \tau)\, \nu(\tau, t') = 1$ for $\tau \in [\max(0, t - \epsilon), t]$. These
results indicate that the integrand in (2.54) is almost surely bounded, and as a
consequence, the integral tends to 0 as $\epsilon \to 0$. This proves that $\lim_{\epsilon \to 0} E[\delta_t] = 0$
for every $t \ge 0$, which in combination with $\delta_t \ge 0$ leads to $\lim_{\epsilon \to 0} \delta_t = 0$, with
probability 1.

By letting $\delta_t = 0$ on the right side of (2.53) and noting that $\beta_0 = 0$, we find
that $\beta_t \le 0$ for $t \ge 0$. On the other hand, we know from the definition that $\beta_t \ge 0$, thus
we conclude that $\lim_{\epsilon \to 0} \beta_t = 0$, almost surely for every $t \ge 0$. Finally, we complete
the proof by applying $\lim_{\epsilon \to 0} X_T^\ell / X_T^u = 1$ to the inequality

\[
B_2^\ell(\mathcal{Y}_T; \{\lambda_t\}, T) \le \Lambda(\mathcal{Y}_T; \{\lambda_t\}, T) \le B_2^u(\mathcal{Y}_T; \{\lambda_t\}, T).
\]

□

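Theorem 2.5.2. Assume that for $0 \le \tau \le t \le T$, the mapping $\gamma(\cdot, \cdot)$ satisfies the
inequality

\[
\gamma^\ell(t, \tau) \le \gamma(t, \tau) \le \gamma^u(t, \tau). \tag{2.55}
\]

Let $X_t$ be the solution of the integral equation

\[
X_t = 1 + \int_0^t \gamma(t, \tau)\, \lambda_\tau c_\tau f(z_\tau)\, X_\tau\, d\tau \tag{2.56}
\]

and define

\[
B(\mathcal{Y}_T; \{\lambda_t\}, T) = X_T \exp\Big(-\int_0^T \lambda_t\, dt\Big). \tag{2.57}
\]

Then, under the assumptions of Theorem 2.5.1, with probability 1, we have

\[
\lim_{\epsilon \to 0} \frac{\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)}{B(\mathcal{Y}_T; \{\lambda_t\}, T)} = 1.
\]

Proof. The proof follows from the proof of Theorem 2.5.1 and $X_T^\ell \le X_T \le X_T^u$. □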
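The integral equations (2.34), (2.41), and (2.56) share the same Volterra form and differ only in the kernel. As a rough numerical illustration, the sketch below solves such an equation with an explicit left-rectangle rule. The function `solve_volterra`, its arguments `kernel` and `drive` (standing in for the kernel and for the product $\lambda_\tau c_\tau f(z_\tau)$), and the grid size are assumptions of this example, not a method taken from the text.

```python
import math


def solve_volterra(kernel, drive, t_end, steps):
    """Left-rectangle solution of X_t = 1 + int_0^t kernel(t, tau) * drive(tau) * X_tau dtau.

    kernel : callable (t, tau) -> value in [0, 1], e.g. a stand-in for gamma(t, tau)
    drive  : callable tau -> nonnegative value, a stand-in for lambda * c * f(z)
    Returns the list [X_0, X_dt, ..., X_{t_end}] on a uniform grid.
    """
    dt = t_end / steps
    x = [1.0]
    for i in range(1, steps + 1):
        t = i * dt
        acc = 0.0
        # Only earlier grid values of X enter, so the scheme is explicit.
        for j in range(i):
            tau = j * dt
            acc += kernel(t, tau) * drive(tau) * x[j]
        x.append(1.0 + acc * dt)
    return x
```

With a kernel identically equal to 1 and a unit drive the solution is $e^t$, which gives a simple accuracy check. Replacing the kernel by a smaller one, such as the cutoff kernel of Theorem 2.4.4, can only decrease the computed solution, mirroring the ordering $X_T^\ell \le X_T \le X_T^u$ used in the proof.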

Remark 2.5.1. The proof of Theorem 2.5.1 suggests that $B(\mathcal{Y}_T; \{\lambda_t\}, T)$ in (2.57)
is an appropriate approximation for $\Lambda(\mathcal{Y}_T; \{\lambda_t\}, T)$, if $\epsilon$ is small enough to ensure
that most sample paths of $\{y_t\}$ are free from overlapping pulses.

Theorem 2.5.3. Let $\lambda_1(t)$ and $\lambda_2(t)$ be nonnegative functions defined over $[0, T]$.
Then, under the assumptions of Theorem 2.5.1, with probability 1, we have

\[
\lim_{\epsilon \to 0} \frac{\Lambda(\mathcal{Y}_T; \{\lambda_1(t)\}, T)}{\Lambda(\mathcal{Y}_T; \{\lambda_2(t)\}, T)}
= \exp\Big(\int_0^T (\lambda_2(t) - \lambda_1(t))\, dt\Big) \prod_{n=1}^{N_T} \frac{\lambda_1(t_n)}{\lambda_2(t_n)}. \tag{2.58}
\]
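The right side of (2.58) depends only on the two rate functions and on the occurrence times, so it is easy to evaluate numerically. The sketch below does this in the log domain; the function name `limiting_likelihood_ratio`, the midpoint-rule integration, and the requirement that both rates be strictly positive at the occurrence times are assumptions of this illustration.

```python
import math


def limiting_likelihood_ratio(event_times, rate1, rate2, t_end, steps=10_000):
    """Evaluate exp(int_0^T (rate2 - rate1) dt) * prod_n rate1(t_n) / rate2(t_n).

    event_times : occurrence times t_1, ..., t_{N_T} in [0, T]
    rate1, rate2: callables giving the two rate functions (positive at the events)
    """
    dt = t_end / steps
    # Midpoint rule for the integral of rate2 - rate1 over [0, T].
    integral = sum(rate2((k + 0.5) * dt) - rate1((k + 0.5) * dt)
                   for k in range(steps)) * dt
    # Sum of log-ratios is numerically safer than a long product.
    log_ratio = integral + sum(math.log(rate1(t) / rate2(t)) for t in event_times)
    return math.exp(log_ratio)
```

For constant rates 2 and 1 on $[0, 1]$ with two occurrence times, the result is $4e^{-1}$; for identical rates it is 1, as expected of a likelihood ratio between identical hypotheses.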


Proof. Consider a sample path of the stochastic process $\{c_t f(z_t)\}$ defined by (2.51)
and assume that $\epsilon$ is much smaller than the minimum distance between two
successive occurrence times $t_n$ and $t_{n+1}$. This ensures that the pulses in (2.51) do not
overlap each other. Under this condition, the goal is to solve the integral
equation (2.41), whose simplified form is given by

\[
X_t^\ell = 1 + \int_0^{t - \epsilon} \lambda_\tau c_\tau f(z_\tau)\, X_\tau^\ell\, d\tau. \tag{2.59}
\]

The solution of this equation encounters a "big jump" during $[t_n, t_n + 2\epsilon]$, $n =
1, 2, \ldots, N_T$. Therefore, we have to solve (2.59) separately for two cases: inside the
intervals $[t_n, t_n + 2\epsilon]$ and outside these intervals.

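For $t \in (t_n, t_n + \epsilon]$, we rewrite the integral equation (2.59) as

$$X_t^\epsilon = X_{t_n}^\epsilon + \int_{t_n - \epsilon}^{t - \epsilon} \lambda_\tau \rho_\tau v(\tau, t_n) X_\tau^\epsilon \, d\tau$$

and solve it as follows. Since $X_\tau^\epsilon$, $\rho_\tau$, and $\lambda_\tau$ are (almost surely) continuous over $\tau \in (t_n - \epsilon, t_n]$, the solution of this equation can be approximated by

$$X_t^\epsilon = X_{t_n}^\epsilon + \lambda_{t_n} \rho_{t_n} X_{t_n}^\epsilon \int_{t_n - \epsilon}^{t - \epsilon} v(\tau, t_n)\, d\tau, \qquad t \in (t_n, t_n + \epsilon]. \tag{2.60}$$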
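
Note that as $\epsilon \to 0$, this approximation tends to the exact solution. Also, the solution of (2.59) for $X_{t_n+2\epsilon}^\epsilon$ can be approximated as

$$X_{t_n+2\epsilon}^\epsilon = X_{t_n+\epsilon}^\epsilon + \int_{t_n}^{t_n+\epsilon} \lambda_t \rho_t v(t, t_n) X_t^\epsilon\, dt \approx X_{t_n+\epsilon}^\epsilon + \lambda_{t_n} \rho_{t_n} \int_{t_n}^{t_n+\epsilon} v(t, t_n) X_t^\epsilon\, dt.$$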

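Upon substituting (2.60) into this expression, we get

$$X_{t_n+2\epsilon}^\epsilon = X_{t_n}^\epsilon \left[ 1 + \lambda_{t_n} \rho_{t_n} \int_{t_n-\epsilon}^{t_n+\epsilon} v(t, t_n)\, dt + \lambda_{t_n}^2 \rho_{t_n}^2 \int_{t_n}^{t_n+\epsilon} \int_{t_n-\epsilon}^{t-\epsilon} v(t, t_n)\, v(\tau, t_n)\, d\tau\, dt \right].$$

For the sake of simplicity, we express this result as

$$X_{t_n+2\epsilon}^\epsilon = X_{t_n}^\epsilon \left[ 1 + \lambda_{t_n} \rho_{t_n} J(\epsilon) + \lambda_{t_n}^2 \rho_{t_n}^2 \tilde{J}(\epsilon) \right] \tag{2.61}$$

where $J(\epsilon)$ and $\tilde{J}(\epsilon)$ are defined as

$$J(\epsilon) = \int_0^{2\epsilon} v(t, \epsilon)\, dt, \qquad \tilde{J}(\epsilon) = \int_\epsilon^{2\epsilon} \int_0^{t-\epsilon} v(t, \epsilon)\, v(\tau, \epsilon)\, d\tau\, dt.$$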


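In order to obtain $X_t^\epsilon$ for $t \in (t_n + 2\epsilon, t_{n+1}]$, we need to solve the integral equation

$$X_t^\epsilon = X_{t_n+2\epsilon}^\epsilon + \int_{t_n+\epsilon}^{t-\epsilon} \lambda_\tau \rho_\tau X_\tau^\epsilon\, d\tau.$$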




As  e  — >  0,  the  solution  of  this  equation  at  t  =  tn+l  is  given  by 


Xtn+2e  exP 


(2.62) 


Starting  from 


and  using  (2.61)  and  (2.62)  in  a  recursive  procedure,  it  is  straightforward  to  show 


that 


XT  =  exp 


nt 

(l  +  ^nPtnJ  (6)  +  ^ tnPtnJ  (6))  • 

n— 1 


This  result  leads  to 


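$$\frac{\Lambda(\mathcal{B}_T; \{\lambda_1(t)\}, T)}{\Lambda(\mathcal{B}_T; \{\lambda_2(t)\}, T)} = U(\epsilon) \exp\left( \int_0^T (\lambda_2(t) - \lambda_1(t))\, dt \right) \prod_{n=1}^{N_T} \frac{\lambda_1(t_n)}{\lambda_2(t_n)} \tag{2.63}$$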


where  U  (e)  is  defined  as 


We  can  show  that  lime^0  U  (e)  =  1,  by  the  fact  that  for  every  t  e  [0,  T],  with  proba¬ 
bility  1,  we  have  lime^0  Pt  —  0,  lim€^0(pt  J  (e))  1  =  0,  and  lime_*0  ptJ  (e)  /  J  (e)  =  0. 
This  proves  (2.58),  upon  being  applied  to  (2.63).  □ 

Remark 2.5.2. As $\epsilon \to 0$, the stochastic differential equation (2.2) tends to

$$dy_t = q\, dN_t + \sigma\, dw_t. \tag{2.64}$$

On the other hand, (2.58) is the likelihood ratio function associated with a channel whose output $y_t$ is described by

$$dy_t = q\, dN_t. \tag{2.65}$$

We conclude from these facts that, subject to the detection problem of Section 2.2.3, the channels described by (2.64) and (2.65) are equivalent, in the sense that they have equal probability of error. In addition, since this argument is valid for every $T$, every integer $M$, and every set of waveforms $\{\lambda_1(t), \lambda_2(t), \ldots, \lambda_M(t)\}$, we argue that (2.64) and (2.65) have identical channel capacity.
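
As an illustration of how the limiting ratio (2.58) could be evaluated numerically, the following minimal Python sketch computes its logarithm from a list of pulse occurrence times. The function name `log_likelihood_ratio`, the midpoint-rule integration, and the example rates are assumptions made for illustration only; both rate functions are assumed to be strictly positive at the occurrence times.

```python
import math


def log_likelihood_ratio(event_times, lam1, lam2, horizon, steps=10_000):
    """Logarithm of the right-hand side of (2.58) for two candidate rates."""
    # Midpoint-rule approximation of the integral of (lam2 - lam1) over [0, horizon].
    dt = horizon / steps
    integral = sum(
        (lam2((k + 0.5) * dt) - lam1((k + 0.5) * dt)) * dt for k in range(steps)
    )
    # Sum of the log rate ratios at the observed occurrence times t_n.
    jumps = sum(math.log(lam1(t) / lam2(t)) for t in event_times)
    return integral + jumps


# Hypothetical example: constant rates 5 and 2 over [0, 1], three detected pulses.
llr = log_likelihood_ratio([0.1, 0.4, 0.9], lambda t: 5.0, lambda t: 2.0, horizon=1.0)
```

Under equal prior probabilities, a positive value would favor the waveform $\lambda_1(t)$ and a negative value the waveform $\lambda_2(t)$.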

Remark 2.5.3. We consider the case that $\sigma = \sigma(\epsilon)$ is a decreasing function of $\epsilon$ such that $\lim_{\epsilon \to 0} \sigma(\epsilon) = \infty$. We can verify that the results of Theorems 2.5.1 and 2.5.3 still hold, if for every $0 < \epsilon < 1$ and some $\delta > 0$, $\sigma(\epsilon)$ satisfies the inequality


u(e)  < 


g  /  ng(o,o) 
1  +  5  V  e  In  e 


Under this condition, Theorem 2.5.3 implies that as $\epsilon \to 0$, the probability of error of channel (2.2) tends to the probability of error of the ideal channel (2.65). On the other hand, any linear filtering scheme fails to reconstruct the transmitted message, since $\lim_{\epsilon \to 0} \sigma(\epsilon) = \infty$ implies that the signal-to-noise ratio at the output of that filter will be 0.

2.6  Approximate  Implementation 

The formal solutions introduced in Section 2.3 can be implemented only by means of infinite-dimensional systems; however, under certain assumptions, finite-dimensional approximations for $\Lambda(\mathcal{B}_T; \{\lambda_t\}, T)$ can be derived from (2.17) and the results of Sections 2.4 and 2.5. The goal of this section is to determine such approximate implementations and discuss the conditions under which they are useful. We shall keep the assumption of Section 2.5 in which $\pi(\cdot)$ is of finite duration, i.e., $\pi(t) = 0$ for $t \notin [0, \epsilon]$. The interpretation of this assumption is that most of the pulse energy is concentrated in $[0, \epsilon]$ such that the energy beyond this interval is negligible.

According to (2.15), the stochastic process $\{z_t\}$ can only be implemented using an anticausal system; however, due to the assumption above, the stochastic process $\tilde{z}_t = z_{t-\epsilon}$, $t \in [\epsilon, T + \epsilon]$, can be implemented by a causal, linear, time-varying system. This system is characterized by

$$\tilde{z}_t = \frac{1}{\sigma} \int_{-\infty}^{+\infty} h(t, \tau)\, dy_\tau = \frac{1}{\sigma} \int_0^{t} h(t, \tau)\, dy_\tau \tag{2.66}$$

where the impulse response $h(t, \tau)$ is given by

$$h(t, \tau) = \pi(\epsilon - (t - \tau))\, u(T - \tau)$$
with $u(\cdot)$ denoting the unit step function. Note that (2.9) and (2.17) remain unchanged if we shift up the limits of integration by $\epsilon$ (i.e., replacing 0 with $\epsilon$ and $T$ with $T + \epsilon$) and at the same time replace $z_t$ with $\tilde{z}_t$; e.g., (2.17) can be written as

$$\Lambda(\mathcal{B}_T; \{\lambda_t\}, T) = E\left[ \exp\left( \int_\epsilon^{T+\epsilon} \lambda_{t-\epsilon}\, \Phi_q(\tilde{z}_t + j\xi_{t-\epsilon})\, dt \right) \Big|\, \mathcal{B}_T \right].$$

This shows that by accepting a delay of $\epsilon$ in the decision time, we can implement (2.9) and (2.17) using the causal system (2.66). For the sake of simplicity, in the rest of this section, we keep the time frame $[0, T]$, bearing in mind that it can be replaced with $[\epsilon, T + \epsilon]$ in order to implement $z_t$ by means of the causal filter (2.66).
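
To make the delay argument concrete, the following Python sketch shows a discrete-time analogue of the causal filter (2.66). It assumes a uniform time grid, a pulse given by its samples on $[0, \epsilon]$, and observation increments `dy`; the function name `delayed_matched_filter` and the sample values are hypothetical and serve only as an illustration.

```python
def delayed_matched_filter(dy, pulse, sigma):
    """Discrete analogue of (2.66): correlate the increments dy with the pulse.

    dy    : increments of the observation on a uniform grid
    pulse : samples of the pulse shape on [0, eps], so eps = (len(pulse) - 1) steps
    sigma : noise intensity used for normalization
    """
    length = len(pulse)
    out = []
    for k in range(len(dy)):
        acc = 0.0
        for i in range(length):
            j = k - (length - 1) + i  # input index; never exceeds k, so the filter is causal
            if j >= 0:
                acc += pulse[i] * dy[j]
        out.append(acc / sigma)
    return out


# Hypothetical noiseless example: one pulse starting at grid index 2.
shape = [1.0, 2.0, 1.0]
observed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
filtered = delayed_matched_filter(observed, shape, sigma=1.0)
```

In this example the filter output peaks two samples after the pulse begins, which corresponds to the delay of $\epsilon$ accepted in the decision time.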

Throughout this section, we introduce two categories of approximation for $\Lambda(\cdot)$. The first category will be derived from expression (2.17), while the second one is based on Theorem 2.5.2. While the first category can be used for the general case of $a$ and $\{q_k\}$, the second category is only applicable to the case that $a$ and $q_k = q$ are deterministic values.


2.6.1  Approximation:  Category  I 

We derive our first approximation for $\Lambda(\cdot)$ from (2.17) by approximating

$$\Phi_q(z_t + j\xi_t) \approx \Phi_q(z_t) \tag{2.67}$$

which results in

$$\Lambda(\mathcal{B}_T; \{\lambda_t\}, T) \approx \exp\left( \int_0^T \lambda_t \Phi_q(z_t)\, dt \right). \tag{2.68}$$


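For the case of a random $a$, using (2.14), (2.22), and (2.68) we can write

$$\Lambda_a(\mathcal{B}_T; \{s_t\}, T) \approx \Phi_a\left( \int_0^T s_t \Phi_q(z_t)\, dt \right) \exp\left( \int_0^T \mu_t \Phi_q(z_t)\, dt \right). \tag{2.69}$$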


The  block  diagrams  in  Figure  2.1  illustrate  the  implementation  of  (2.68)  and  (2.69). 


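[Figure 2.1 block diagrams: the input is $y_t$ and the output is $\ln \Lambda(\mathcal{B}_T; \{\lambda_t\}, T)$.]

Figure 2.1: Implementation of (a) approximation (2.68) and (b) approximation (2.69). In (b), the nonlinear mapping $\Phi_a^*(\cdot)$ is defined as $\Phi_a^*(\cdot) = \ln \Phi_a(\cdot)$.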


In order to obtain a condition under which (2.68) and (2.69) are useful, we rewrite (2.17) as


A  ($};  {At}  ,  T)  =  exp  A tdtj  E 

It can be observed from this expression that the approximation (2.67) is equivalent to approximating $e^{j\theta\xi_t} \approx 1$. Let $b_t$ be given by (2.27) and $q_{\max}$ be defined such that $\Pr\{|q_k| > q_{\max}\} = 0$. Then, the approximation $e^{j\theta\xi_t} \approx 1$ is justified, if for every $t \in [0, T]$ and every $|\theta| < q_{\max}$, we have


exp 


'»-|-oo 


A te0Zte>0*tdPq  (6)  dt 


f *Vt 


3 

4 


{e\)2 «  i 


~  92bt  <  1. 


(2.70) 

(2.71) 


It  is  straightforward  to  verify  that  both  conditions  (2.70)  and  (2.71)  are  satisfied  if 

7T2  (t)  dt  <  1 .  (2.72) 

Remark 2.6.1. As mentioned earlier, the first stage for implementing $\Lambda(\cdot)$ is a linear filter which has an impulse response with duration $\epsilon$. Since the bandwidth of this filter is roughly $1/\epsilon$, the effective power of the white noise (thermal noise) is $2\sigma^2/\epsilon$. For simplicity of discussion, assume that $q_k = q$ is a deterministic value. Then, (2.72) can be expressed as

\  fir2  (; t )  dt  <  ^  ■  (2-73) 

The interpretation of (2.73) is that the average power of a single pulse must be much smaller than the effective power of the white noise. Note that (2.73) does not necessarily require a small signal-to-noise ratio (SNR), since with a large rate $\lambda_t$, we can maintain a large SNR while satisfying (2.73).

We can improve (2.68) and (2.69) by approximating

$$\Phi_q(z_t + j\xi_t) \approx \Phi_q(z_t) + j\Phi_q'(z_t)\, \xi_t \tag{2.74}$$

which is equivalent to

$$e^{j\theta\xi_t} \approx 1 + j\theta\xi_t.$$

For  this  approximation  to  be  valid,  in  addition  to  (2.70),  the  condition 


E 


(0Zt  -  sm ezt)2  -  ^  (e%y  <  i 


12 


(2.75) 


must be satisfied for $t \in [0, T]$ and $|\theta| < q_{\max}$. A unified condition that satisfies both (2.70) and (2.75) is obtained as

^  (^|f  ^°°7r2(t)df)  <1 


which  is  less  restrictive  than  (2.72). 

In order to determine an approximation for $\Lambda(\cdot)$ based on (2.74), we substitute (2.74) into (2.17) to find


A  (&T\  {At}  ,  T)  ~  exp  (  I  (zt)  dt  )  E 


exp  (  j  /  A*$'  (zt)  itdt 


JVt 


(2.76) 


The  second  integral  in  (2.76)  can  be  written  as 


A t&a  {zt)  £tdt  =  /  xtd£t 


(2.77) 


where  xt  is  defined  as 


1  fT 

xt  —  —  7 t  (t  —  t)  At<E{  (zt)  dr.  (2.78) 

°  Jo 

Note  that  xt  can  be  implemented  using  a  causal,  linear,  time-varying  system  with 
the  impulse  response 


g  (■ t ,  r)  =7 t  (t  —  r)u  (r)  u(T  —  r) . 


Let  zt  and  xt  be  the  sample  paths  of  {zt}  and  {^t},  respectively,  noting  that  xt 
is  associated  with  zt  through  (2.78).  Then,  for  the  sample  path  zt,  the  left  side 
of  (2.77)  is  a  zero-mean  Gaussian  random  variable  with  a  variance  JQT  x\.  Thus, 
noting  that  {zt}  and  { xt }  are  independent  of  {G},  we  can  write 


E 


exp 


j  /  A t&  (zt)£tdt 


(2.79) 

Since $\{z_t\}$ and $\{x_t\}$ are smooth stochastic processes with (almost surely) continuous sample paths, in (2.79), we can replace the sample paths $z_t$ and $x_t$ with the stochastic processes $\{z_t\}$ and $\{x_t\}$, respectively. Using this fact and substituting (2.79) into (2.76), we obtain


A(^T;{M,T)~exp 


(2.80) 


The  implementation  of  this  approximation  is  illustrated  in  Figure  2.2. 


Figure 2.2: Implementation of (2.80). Here, the impulse response $\tilde{g}(t, \tau)$ is defined as $\tilde{g}(t, \tau) = g(t - \epsilon, \tau - \epsilon)$.


In  order  to  extend  (2.80)  to  the  case  of  a  random  a,  we  define  the  stochastic 
processes 


r)  sT$'q  (zT)  dr 
t)  ii&q  (zT)  dr 


where  st  is  a  deterministic  function.  Next,  substituting  xt  =  axt  +  xt  into  (2.80) 
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and  using  (2.14),  we  get 


A«  {®t\  {st}  ,  T)  ~  exp 


■Fn 


(Zt)  ~  2  Xt  )  dt 


(st$q(zt)  -  xtxt)dt ,  /  x2dt 


(2.81) 


where  Fa(-,-)  is  defined  as 


Fa  {vuvz) 


dPa  (a) . 


The  implementation  of  this  approximation  is  illustrated  in  Figure  2.3. 


Figure 2.3: Implementation of (2.81). In this block diagram, we have $\tilde{g}(t, \tau) = g(t - \epsilon, \tau - \epsilon)$ and $F_a^*(v_1, v_2) = \ln F_a(v_1, v_2)$.

2.6.2 Approximation: Category II

The second category of approximation we propose for $\Lambda(\cdot)$ is based on Theorem 2.5.2. Using the results of this theorem, we argue that for a small $\epsilon$, we can approximate

$$\Lambda(\mathcal{B}_T; \{\lambda_t\}, T) \approx B(\mathcal{B}_T; \{\lambda_t\}, T) \tag{2.82}$$

where $B(\cdot)$ is determined by solving the integral equation (2.56) and using (2.57). In order to establish a condition under which (2.82) is a close approximation, we focus on the proof of Theorem 2.5.1. This proof indicates that for a nonzero $\epsilon$, the claim of the theorem is approximately valid, if the sample paths of $\{z_t\}$ are free from overlapping pulses. This is equivalent to having a small probability for the occurrence of more than one pulse in an interval with duration $\epsilon$. We know from the properties of the Poisson process that this condition is satisfied if we have $\epsilon^2 \lambda_i^2(t) \ll 1$ for $i = 1, 2, \ldots, M$ and every $t \in [0, T]$. The structure of a system which determines $B(\mathcal{B}_T; \{\lambda_t\}, T)$ is illustrated in Figure 2.4.
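
The non-overlap condition can be checked numerically. The sketch below, a minimal illustration rather than part of the original development, computes the probability that a Poisson process with a locally constant rate produces two or more pulses in an interval of duration $\epsilon$; the function name `prob_multiple_pulses` and the numerical values are hypothetical.

```python
import math


def prob_multiple_pulses(rate, eps):
    """Probability of two or more Poisson points in an interval of length eps.

    The rate is assumed to be constant over the interval, so the count is
    Poisson with mean rate * eps.
    """
    mean = rate * eps
    return 1.0 - math.exp(-mean) * (1.0 + mean)


# Hypothetical values: rate of 1e6 pulses per second, pulse duration of 10 ns.
p_overlap = prob_multiple_pulses(rate=1.0e6, eps=1.0e-8)
# For small rate * eps this probability behaves like (rate * eps) ** 2 / 2.
small_mean_estimate = 0.5 * (1.0e6 * 1.0e-8) ** 2
```

The leading term $(\epsilon\lambda)^2/2$ of this probability is what makes $\epsilon^2\lambda_i^2(t) \ll 1$ a natural smallness condition.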


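[Figure 2.4 block diagram: the input is $y_t$ and the output is $B(\mathcal{B}_T; \{\lambda_t\}, T)$.]

Figure 2.4: Structure of a system which determines $B(\mathcal{B}_T; \{\lambda_t\}, T)$ by solving (2.56). In this block diagram, we have $\eta = \exp\left( -\int_0^T \lambda_t\, dt \right)$ and $\tilde{\gamma}(t, \tau) = \gamma(t - \epsilon, \tau - \epsilon)$.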

Chapter 3


Estimation  and  Control  with  Space-Time  Point  Process 
Observations 

3.1  Introduction 

The  concern  of  this  chapter  is  to  estimate  the  state  of  a  stochastic  dynamical  model 
which  modulates  the  rate  of  a  space-time  point  process.  In  addition,  an  associated 
optimal  control  problem  will  be  discussed  which  has  a  direct  application  in  the 
optical  beam  tracking  aspect  of  free-space  optics.  Sections  3.2  and  3.3  present 
the  prior  work  by  Snyder  and  Fishman  [39]  and  Rhodes  and  Snyder  [20],  with  the 
intention  of  providing  the  necessary  background  for  the  next  chapters.  In  Section  3.4, 
we  introduce  a  new  formulation  of  the  problem  which  is  useful  in  generalizing  the 
results  of  [39,  20].  An  approximation  method  will  be  developed  in  Section  3.5  which 
incorporates  the  results  of  [39,  20]  to  explore  a  suboptimal  control  law  for  an  optical 
beam  tracking  system  with  a  finite  resolution  position-sensitive  photodetector. 

3.2 The Model and Problem Statement


In this section, we present a stochastic model in which the state of a linear stochastic state-space equation modulates the rate of a space-time point process which is regarded as the system output. In our description of the model, we closely follow [39, 20]. For a rigorous and complete treatment of the space-time point process we refer the reader to [40]. After introducing the model, we state its associated estimation and control problems.

3.2.1  The  Model 

Consider the stochastic linear dynamical model

$$dx_t = A_t x_t\, dt + B_t u_t\, dt + D_t\, dw_t \tag{3.1}$$

where $x_t \in \mathbb{R}^n$ and $u_t \in \mathbb{R}^k$ are random vectors standing for state and control, $\{w_t\}$ is a $p$-dimensional standard Wiener process, and $A_t$, $B_t$, and $D_t$ are uniformly bounded deterministic matrices with proper dimensions. The initial state $x_0$ is a Gaussian vector with mean $\bar{x}_0$ and covariance matrix $\bar{\Sigma}_0$ and is independent of $\{w_t\}$.

The observation of the state vector is provided via a space-time point process defined over $[0, \infty) \times \mathcal{A}$, where $\mathcal{A} \subset \mathbb{R}^m$, $m \le n$. Each point of this process is characterized in terms of a temporal component $t \in [0, \infty)$ and a spatial component $r \in \mathcal{A}$. Let the nonnegative scalar map $\lambda_t(r, x_t)$, which is defined over $t \in [0, \infty)$ and $r \in \mathcal{A}$ and is parameterized by the state vector $x_t$, be the rate associated with this process. Then, the space-time point process is (statistically) characterized as follows.

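Let $N(\mathcal{T} \times \mathcal{S})$ denote the number of points occurring in $\mathcal{T} \times \mathcal{S}$, where $\mathcal{T}$ and $\mathcal{S}$ are Borel sets in $[0, \infty)$ and $\mathcal{A}$, respectively. Associated with $N(\mathcal{T} \times \mathcal{S})$, define the random variable

$$\Lambda(\mathcal{T} \times \mathcal{S}) = \int_{\mathcal{T} \times \mathcal{S}} \lambda_t(r, x_t)\, dt\, dr.$$

Then, conditioned on $\Lambda(\mathcal{T} \times \mathcal{S})$, the random variable $N(\mathcal{T} \times \mathcal{S})$ is Poisson distributed, i.e.,

$$\Pr\{N(\mathcal{T} \times \mathcal{S}) = n \mid \Lambda(\mathcal{T} \times \mathcal{S})\} = \frac{e^{-\Lambda(\mathcal{T} \times \mathcal{S})}\, \Lambda^n(\mathcal{T} \times \mathcal{S})}{n!}.$$

Moreover, for disjoint $\mathcal{T}_1 \times \mathcal{S}_1$ and $\mathcal{T}_2 \times \mathcal{S}_2$, conditioned on $\Lambda(\mathcal{T}_1 \times \mathcal{S}_1)$ and $\Lambda(\mathcal{T}_2 \times \mathcal{S}_2)$, the random variables $N(\mathcal{T}_1 \times \mathcal{S}_1)$ and $N(\mathcal{T}_2 \times \mathcal{S}_2)$ are statistically independent.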

We shall assume that the rate of the space-time process has the form

$$\lambda_t(r, x_t) = \mu_t \gamma_t(r - C_t x_t) \tag{3.2}$$

where $C_t$ is a bounded $m \times n$ matrix, the known function $\mu_t$ is nonnegative for every $t \ge 0$, and the nonnegative map $\gamma_t(\cdot) : \mathbb{R} \times \mathbb{R}^m \to \mathbb{R}$ satisfies

$$\int_{\mathbb{R}^m} \gamma_t(r)\, dr = 1. \tag{3.3}$$

In particular, we are interested in the case that $\gamma_t(\cdot)$ is a Gaussian map, i.e.,

$$\gamma_t(r) = \Phi_m(r; 0, R_t) \tag{3.4}$$

where $R_t = R_t^T$ is an $m \times m$ positive definite matrix and $\Phi_k : \mathbb{R}^k \times \mathbb{R}^k \times \mathbb{R}^{k \times k} \to \mathbb{R}$, $k = 1, 2, 3, \ldots$, is defined as

$$\Phi_k(z; \bar{z}, \Theta) = (2\pi)^{-k/2} (\det \Theta)^{-1/2} \exp\left( -\tfrac{1}{2} (z - \bar{z})^T \Theta^{-1} (z - \bar{z}) \right). \tag{3.5}$$
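
As a rough illustration of the model (3.2) with the Gaussian map (3.4), the sketch below draws points of a space-time Poisson process in the plane. It rests on simplifying assumptions that are not part of the model above: the state is frozen (so the spot center $C_t x_t$ is a fixed vector), $\mu_t$ is a constant, and $R_t = \sigma^2 I$. The function name `simulate_space_time_points` and all numerical values are hypothetical.

```python
import random


def simulate_space_time_points(mu, center, sigma, horizon, seed=0):
    """Draw (time, location) pairs on [0, horizon) for a fixed spot center.

    Because the spatial profile integrates to one, the occurrence times form a
    Poisson process with rate mu, and each location is Gaussian around center.
    """
    rng = random.Random(seed)
    points = []
    t = 0.0
    while True:
        t += rng.expovariate(mu)  # exponential inter-arrival time
        if t >= horizon:
            break
        r = tuple(rng.gauss(c, sigma) for c in center)  # isotropic Gaussian spot
        points.append((t, r))
    return points


# Hypothetical example: 200 expected points per unit time, spot centered at (0.5, -0.2).
points = simulate_space_time_points(mu=200.0, center=(0.5, -0.2), sigma=0.1, horizon=1.0)
```

Each returned pair plays the role of an occurrence time and a spatial component $r \in \mathcal{A} = \mathbb{R}^2$.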

3.2.2  Problem  Statement 

Let $(\Omega, \mathcal{F}, P)$ be the underlying probability space for the stochastic model of Section 3.2.1. Define $\mathcal{B}_t$ as the $\sigma$-algebra generated by the space-time point process over $[0, t)$ and assume that the control vector $u_t$ is $\mathcal{B}_t$-measurable. We say $u_t$ is an admissible control if it is $\mathcal{B}_t$-measurable and the solution to (3.1) is well-defined. Subject to the model of Section 3.2.1, we define the following problems.

Estimation Problem: For every $t > 0$, determine $p_{x_t}(x | \mathcal{B}_t)$, the posterior density of the state vector $x_t$ given $\mathcal{B}_t$. In particular, determine the conditional expectation $\hat{x}_t = E[x_t | \mathcal{B}_t]$ and the conditional covariance matrix $\Sigma_t = E\left[ (x_t - \hat{x}_t)(x_t - \hat{x}_t)^T \,|\, \mathcal{B}_t \right]$.

Control Problem: Find an admissible control $\{u_t, t \in [0, T]\}$ which minimizes the cost functional

$$J = E\left[ \int_0^T \left( x_t^T Q_t x_t + u_t^T P_t u_t \right) dt + x_T^T S x_T \right] \tag{3.6}$$

where $P_t = P_t^T > 0$, $Q_t = Q_t^T \ge 0$, and $S = S^T \ge 0$ are matrices of proper dimensions and $T > 0$ is a fixed time horizon.


3.3  Relevant  Prior  Work 

In this section, we state the results obtained in [39, 20] regarding the estimation and control problems defined in Section 3.2.2. These results provide an adequate framework for our discussion in the next chapters. Theorem 3.3.1 below provides a solution to the estimation problem in the most general case. The remaining theorems address the special case in which the rate of the space-time point process is given by (3.2) and (3.4).

Before stating the theorems, we fix notation. Let $(t_{k-1}, t_k]$ be the interval between two successive occurrences of the space-time point process and $r_k$ be the location of the $k$th occurring point. Assume that $h_t(r, \xi)$ is continuous in $r$ and left continuous in $t$ and $\xi$. Then, the stochastic differential equation

$$d\xi_t = \int_{\mathcal{A}} h_t(r, \xi_t)\, N(dt \times dr)$$

is defined such that $d\xi_t = 0$ during $(t_{k-1}, t_k)$ and $\xi_t$ encounters a jump of $h_{t_k}(r_k, \xi_{t_k})$ at $t = t_k$, i.e.,

$$\xi_{t_k^+} = \xi_{t_k} + h_{t_k}(r_k, \xi_{t_k}).$$

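Theorem 3.3.1 (Rhodes and Snyder 1977 [20]). Consider the state-space equation (3.1) and its associated space-time observation defined in Section 3.2.1. Assume that the increasing family of $\sigma$-algebras $\mathcal{B}_t$ is given and that $u_t$ is $\mathcal{B}_t$-measurable. Then, the posterior density of the state vector $x_t$ is the solution of the stochastic partial differential-integral equation

$$dp_{x_t}(x | \mathcal{B}_t) = \mathcal{L}\{p_{x_t}(x | \mathcal{B}_t)\}\, dt + p_{x_t}(x | \mathcal{B}_t) \int_{\mathcal{A}} \left( \lambda_t(r, x)\, \hat{\lambda}_t^{-1}(r) - 1 \right) N(dt \times dr) - p_{x_t}(x | \mathcal{B}_t) \int_{\mathcal{A}} \left( \lambda_t(r, x) - \hat{\lambda}_t(r) \right) dr\, dt \tag{3.7}$$

where $\hat{\lambda}_t(r) = E[\lambda_t(r, x_t) | \mathcal{B}_t]$ and $\mathcal{L}\{\cdot\}$ is the forward Kolmogorov operator associated with (3.1), defined as

$$\mathcal{L}\{\cdot\} = -\sum_{i=1}^{n} \frac{\partial \left[ (A_t x + B_t u_t)(\cdot) \right]_i}{\partial x_i} + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 \left[ D_t D_t^T (\cdot) \right]_{ij}}{\partial x_i\, \partial x_j}.$$

Proof. See [39, 20]. □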

Corollary 3.3.1 (Rhodes and Snyder 1977 [20]). Assume that $\mathcal{A} = \mathbb{R}^m$ and $\lambda_t(r, x_t)$ is given by (3.2). Let $\{\mu_t\}$ be a nonnegative stochastic process which is statistically independent of $x_0$ and $\{w_t\}$. Then, under the assumptions of Theorem 3.3.1, the posterior density $p_{x_t}(x | \mathcal{B}_t)$ is the solution of

$$dp_{x_t}(x | \mathcal{B}_t) = \mathcal{L}\{p_{x_t}(x | \mathcal{B}_t)\}\, dt + p_{x_t}(x | \mathcal{B}_t) \int_{\mathbb{R}^m} \left( \gamma_t(r - C_t x)\, \hat{\gamma}_t^{-1}(r) - 1 \right) N(dt \times dr) \tag{3.8}$$

where $\hat{\gamma}_t(\cdot)$ is defined as $\hat{\gamma}_t(r) = E[\gamma_t(r - C_t x_t) | \mathcal{B}_t]$.


Proof. Following [20], we temporarily replace $\mathcal{B}_t$ with $\mathcal{B}'_t$, the $\sigma$-algebra generated jointly by $\mathcal{B}_t$ and the $\sigma$-algebra generated by $\{\mu_t\}$ over $[0, \infty)$. Then, Theorem 3.3.1 indicates that $p_{x_t}(x | \mathcal{B}'_t)$ is the solution of (3.7) with $\lambda_t(r, x_t)$ replaced from (3.2) and $\mathcal{B}_t$ replaced with $\mathcal{B}'_t$. From condition (3.3), it is easy to verify that the second integral on the right side of this equation is identically zero, which leads to (3.8) with $\mathcal{B}_t$ replaced by $\mathcal{B}'_t$. Since this equation does not depend on $\{\mu_t\}$, we can replace $\mathcal{B}'_t$ with $\mathcal{B}_t$ to show that $p_{x_t}(x | \mathcal{B}_t)$ satisfies (3.8). □


Theorem 3.3.2 (Rhodes and Snyder 1977 [20]). Let $\gamma_t(\cdot)$ be the Gaussian map (3.4). Then, under the assumptions of Corollary 3.3.1, the posterior density $p_{x_t}(x | \mathcal{B}_t)$ is Gaussian, i.e.,

$$p_{x_t}(x | \mathcal{B}_t) = \Phi_n(x; \hat{x}_t, \Sigma_t).$$

Here, the conditional expectation $\hat{x}_t$ and the conditional covariance matrix $\Sigma_t$ are the solutions of the stochastic differential equations

$$d\hat{x}_t = A_t \hat{x}_t\, dt + B_t u_t\, dt + \int_{\mathbb{R}^m} M_t (r - C_t \hat{x}_t)\, N(dt \times dr) \tag{3.9a}$$

$$d\Sigma_t = A_t \Sigma_t\, dt + \Sigma_t A_t^T\, dt + D_t D_t^T\, dt - M_t C_t \Sigma_t\, dN_t \tag{3.9b}$$

with the initial states $\hat{x}_0 = \bar{x}_0$ and $\Sigma_0 = \bar{\Sigma}_0$, where $N_t = N([0, t) \times \mathbb{R}^m)$ and $M_t$ is defined as

$$M_t = \Sigma_t C_t^T \left( C_t \Sigma_t C_t^T + R_t \right)^{-1}.$$

Moreover, the conditional covariance matrix $\Sigma_t$ is almost surely positive definite for $t > 0$, provided that $\bar{\Sigma}_0$ is positive definite.
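
The structure of (3.9a) and (3.9b) can be illustrated in the scalar case ($n = m = 1$). The sketch below is a minimal illustration under that assumption, with an Euler step between points and a jump update at each occurrence time; the function names `propagate` and `jump` and the numerical values are hypothetical.

```python
def propagate(x_hat, var, a, b, u, d, dt):
    """One Euler step of (3.9a)-(3.9b) over an interval with no detected point."""
    x_next = x_hat + (a * x_hat + b * u) * dt
    var_next = var + (2.0 * a * var + d * d) * dt
    return x_next, var_next


def jump(x_hat, var, c, r_var, r):
    """Update at an occurrence time, where a point is detected at location r."""
    gain = var * c / (c * var * c + r_var)   # scalar version of M_t
    x_post = x_hat + gain * (r - c * x_hat)  # jump term of (3.9a)
    var_post = var - gain * c * var          # jump term of (3.9b)
    return x_post, var_post


# Hypothetical numbers: prior variance 2, spot variance R = 2, point seen at r = 1.
x_post, var_post = jump(x_hat=0.0, var=2.0, c=1.0, r_var=2.0, r=1.0)
```

With these numbers the post-jump variance equals $(1/2 + 1/2)^{-1} = 1$, the scalar form of the information-type identity used in the proof below to show positive definiteness.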

Proof. We outline the proof and refer the reader to [39] for the details. Let $t_1$ be the first occurrence time of the space-time point process. During $t \in [0, t_1)$, the integral on the right side of (3.8) is zero and the equation is reduced to

$$\frac{\partial p_{x_t}(x | \mathcal{B}_t)}{\partial t} = \mathcal{L}\{p_{x_t}(x | \mathcal{B}_t)\}. \tag{3.10}$$

The solution to this equation is a Gaussian function with mean $\hat{x}_t$ and covariance matrix $\Sigma_t$. In the transition from $t_1^-$ to $t_1^+$, the integral on the right side of (3.8) causes $p_{x_t}(x | \mathcal{B}_t)$ to jump from $\Phi_n(x; \hat{x}_{t_1^-}, \Sigma_{t_1^-})$ into $\Phi_n(x; \hat{x}_{t_1^+}, \Sigma_{t_1^+})$. Continuing this procedure, we find that $p_{x_t}(x | \mathcal{B}_t)$ is Gaussian for every $t \ge 0$.

To prove the second statement, we note that $\Sigma_t$ is positive definite during $t \in [0, t_1)$, since in equation (3.9b), the initial state $\bar{\Sigma}_0$ and $D_t D_t^T$ are positive definite. Also, at $t = t_1$ we have

$$\Sigma_{t_1^+} = \Sigma_{t_1^-} - \Sigma_{t_1^-} C_{t_1}^T \left( C_{t_1} \Sigma_{t_1^-} C_{t_1}^T + R_{t_1} \right)^{-1} C_{t_1} \Sigma_{t_1^-} = \left( \Sigma_{t_1^-}^{-1} + C_{t_1}^T R_{t_1}^{-1} C_{t_1} \right)^{-1}$$

which indicates that $\Sigma_{t_1^+}$ is positive definite. The proof can be completed by continuing this procedure. □


Theorem 3.3.3 (Rhodes and Snyder 1977 [20]). Under the assumptions of Theorem 3.3.2, the admissible control $u_t^*$ which minimizes the cost functional (3.6) is uniquely given by

$$u_t^* = -P_t^{-1} B_t^T K_t \hat{x}_t \tag{3.11}$$

where the $n \times n$ nonnegative definite matrix $K_t$ is the backward solution of the Riccati equation

$$\dot{K}_t = -K_t A_t - A_t^T K_t + K_t B_t P_t^{-1} B_t^T K_t - Q_t$$

with the terminal condition $K_T = S$. The minimum of the cost functional $J$ associated with (3.11) is given by

$$J^* = E\left[ x_0^T K_0 x_0 \right] + \int_0^T \operatorname{tr}\left( K_t B_t P_t^{-1} B_t^T K_t E[\Sigma_t] + K_t D_t D_t^T \right) dt.$$

Proof. According to [20], $J$ can be expressed as

$$J = E\left[ \int_0^T \left( u_t + P_t^{-1} B_t^T K_t \hat{x}_t \right)^T P_t \left( u_t + P_t^{-1} B_t^T K_t \hat{x}_t \right) dt \right] + E\left[ x_0^T K_0 x_0 \right] + \int_0^T \operatorname{tr}\left( K_t B_t P_t^{-1} B_t^T K_t E[\Sigma_t] + K_t D_t D_t^T \right) dt.$$

It is easy to verify that the second and the third terms on the right side of the expression above do not depend on $u_t$, thus we must minimize the first term. This term is always nonnegative with a minimum of zero, which is achieved by choosing $u_t$ as (3.11). □
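
The backward Riccati solution and the control law (3.11) can be sketched numerically in the scalar, time-invariant case. The following Python code is an illustration under those assumptions (constant scalars $a$, $b$, $p$, $q$, $s$ and a fixed-step Runge-Kutta integration); the function names `riccati_backward` and `optimal_control` are hypothetical.

```python
import math


def riccati_backward(a, b, p, q, s, horizon, steps=2000):
    """Integrate K' = -2aK + (b^2 / p) K^2 - q backward from K(horizon) = s.

    Returns a list gains with gains[i] approximating K(i * horizon / steps).
    """
    def rhs(k):
        return -2.0 * a * k + (b * b / p) * k * k - q

    h = horizon / steps
    k = s
    gains = [k]
    for _ in range(steps):
        # Classical Runge-Kutta step taken in the negative time direction.
        k1 = rhs(k)
        k2 = rhs(k - 0.5 * h * k1)
        k3 = rhs(k - 0.5 * h * k2)
        k4 = rhs(k - h * k3)
        k = k - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        gains.append(k)
    gains.reverse()
    return gains


def optimal_control(k, b, p, x_hat):
    """Scalar version of the control law (3.11)."""
    return -(b / p) * k * x_hat


# Hypothetical example with a = 0, b = p = q = 1, s = 0, for which K(t) = tanh(T - t).
gains = riccati_backward(a=0.0, b=1.0, p=1.0, q=1.0, s=0.0, horizon=1.0)
```

The example parameters are chosen because the scalar Riccati equation then reduces to $\dot{K} = K^2 - 1$ with $K_T = 0$, whose solution $K_t = \tanh(T - t)$ gives a simple check on the integration.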


3.4  A  New  Formulation  for  the  Estimation  Problem 

Throughout this section, we assume that $\lambda_t(r, x_t)$ is given by (3.2) and $\gamma_t(\cdot)$ satisfies (3.3), while it is not necessarily Gaussian. Further, we assume that $\mathcal{A} = \mathbb{R}^m$, the pair $(A_t, D_t)$ is controllable, and the $\mathcal{B}_t$-measurable $u_t$ is a piecewise continuous and almost surely bounded stochastic process. Under these assumptions, we determine the solution of the estimation problem in terms of the posterior density associated with a discrete-time model. The procedure for obtaining this new representation is explained below.

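Lemma 3.4.1. Consider the stochastic differential equation

$$d\tilde{x}_t = A_t \tilde{x}_t\, dt + D_t\, dw_t \tag{3.12}$$

and assume that for the fixed but arbitrary $t^* \ge 0$, $\tilde{x}_{t^*}$ is independent of $\{w_t, t \ge t^*\}$ and its probability density function is known. Then, for every $t \ge t^*$, the probability density function $p_{\tilde{x}_t}(x)$ is given by

$$p_{\tilde{x}_t}(x) = \int_{\mathbb{R}^n} \Phi_n\left( x; \Phi^A(t, t^*)\, x^*, W(t, t^*) \right) p_{\tilde{x}_{t^*}}(x^*)\, dx^*. \tag{3.13}$$

Here, $\Phi^A(t, \tau)$ is the transition matrix associated with $A_t$, which satisfies

$$\frac{\partial \Phi^A(t, \tau)}{\partial t} = A_t \Phi^A(t, \tau), \qquad \Phi^A(\tau, \tau) = I_n$$

for every $t \ge \tau$, and $W(t, t^*)$ is defined as

$$W(t, t^*) = \int_{t^*}^{t} \Phi^A(t, \tau)\, D_\tau D_\tau^T \left( \Phi^A(t, \tau) \right)^T d\tau.$$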


Proof. We note that for every $t \ge t^*$, the solution of (3.12) is given by

$$\tilde{x}_t = \Phi^A(t, t^*)\, \tilde{x}_{t^*} + \int_{t^*}^{t} \Phi^A(t, \tau)\, D_\tau\, dw_\tau. \tag{3.14}$$

The two random vectors on the right side of this expression are independent and the second one is a zero-mean Gaussian random vector with covariance matrix $W(t, t^*)$. This leads to the convolution described by (3.13). □
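
For intuition, the quantities in Lemma 3.4.1 can be computed explicitly in the scalar, time-invariant case, where $\Phi^A(t, t^*) = e^{a(t - t^*)}$. The sketch below assumes constant scalars $a$ and $d$ and a Gaussian density at $t^*$, in which case the convolution (3.13) again yields a Gaussian density; the function names `transition`, `gramian`, and `propagate_gaussian` are hypothetical.

```python
import math


def transition(a, delta):
    """Scalar transition 'matrix' over an elapsed time delta = t - t*."""
    return math.exp(a * delta)


def gramian(a, d, delta, steps=2000):
    """Midpoint-rule approximation of W(t, t*) for constant scalars a and d."""
    h = delta / steps
    return sum(
        (transition(a, delta - (k + 0.5) * h) * d) ** 2 * h for k in range(steps)
    )


def propagate_gaussian(mean, var, a, d, delta):
    """Mean and variance at time t when the density at t* is Gaussian."""
    phi = transition(a, delta)
    return phi * mean, phi * phi * var + gramian(a, d, delta)


# Hypothetical example: a = -1, d = 1, elapsed time 1, known initial value 2.
mean_t, var_t = propagate_gaussian(mean=2.0, var=0.0, a=-1.0, d=1.0, delta=1.0)
```

In this example the numerical value of $W$ can be compared with the closed form $d^2 (e^{2a\Delta} - 1) / (2a)$, where $\Delta = t - t^*$.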

Consider the linear discrete-time state-space model

$$\theta_{k+1} = F_k \theta_k + G_k \omega_k \tag{3.15a}$$

$$\rho_k = L_k \theta_k + \nu_k \tag{3.15b}$$

where $\theta_k \in \mathbb{R}^n$ is the state vector and the i.i.d. random vectors $\omega_k \in \mathbb{R}^n$, $k = 1, 2, 3, \ldots$, are zero-mean and Gaussian with covariance matrix $I_{n \times n}$. The matrices $F_k$ and $L_k$ in (3.15) are defined as $F_k = \Phi^A(t_{k+1}, t_k)$ and $L_k = C_{t_k}$, respectively, where $t_0 = 0$ and $t_k$, $k = 1, 2, 3, \ldots$, is the $k$th occurrence time of the space-time point process. The $n \times n$ matrix $G_k$ is defined in such a manner that $G_k G_k^T = W(t_{k+1}, t_k)$. The random vector $\nu_k \in \mathbb{R}^m$ is distributed according to the probability density $p_{\nu_k}(\nu) = \gamma_{t_k}(\nu)$, and $\nu_k$ and $\nu_l$ are independent for $k \ne l$. The initial state $\theta_0$ is a Gaussian random vector with mean $\bar{x}_0$ and covariance matrix $\bar{\Sigma}_0$ and is independent of $\{\omega_k\}$ and $\{\nu_k\}$. We denote the history of the measurement vector $\rho_k$ (up to $k$) by $\mathcal{R}_k = \{\rho_1, \rho_2, \ldots, \rho_k\}$, $k = 1, 2, 3, \ldots$, and $\mathcal{R}_0 = \emptyset$.


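Theorem 3.4.1. Suppose that the measurement vector $\rho_k$ in (3.15b) is generated according to $\rho_k = r_k - C_{t_k} v_{t_k}$, where $r_k$ is the location of the event that occurred at $t = t_k$ (associated with the space-time point process) and the stochastic process $\{v_t, t \ge 0\}$ is defined as

$$v_t = \int_0^t \Phi^A(t, \tau)\, B_\tau u_\tau\, d\tau.$$

Then, for $t \in [t_k, t_{k+1})$, $k = 0, 1, 2, \ldots$, the posterior density $p_{x_t}(x | \mathcal{B}_t)$ can be expressed as

$$p_{x_t}(x | \mathcal{B}_t) = \int_{\mathbb{R}^n} \Phi_n\left( x; \Phi^A(t, t_k)\, \theta + v_t, W(t, t_k) \right) p_{\theta_k}(\theta | \mathcal{R}_k)\, d\theta \tag{3.16}$$

where $p_{\theta_k}(\theta | \mathcal{R}_k)$ is the posterior density of the state vector $\theta_k$ in (3.15a), conditioned on the measurement history $\mathcal{R}_k$.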

Proof. Let $\tilde{x}_t = x_t - v_t$ be the solution of (3.12) and assume that $p_{\tilde{x}_t}(x | \mathcal{B}_t)$ is known at $t = t_k$. For the time interval $t \in [t_k, t_{k+1})$, in which (3.8) reduces to (3.10), we use Lemma 3.4.1 to obtain

$$p_{\tilde{x}_t}(x | \mathcal{B}_t) = \int_{\mathbb{R}^n} \Phi_n\left( x; \Phi^A(t, t_k)\, x^*, W(t, t_k) \right) p_{\tilde{x}_{t_k}}(x^* | \mathcal{B}_{t_k})\, dx^*. \tag{3.17}$$

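Also, from $x_t = \tilde{x}_t + v_t$ and the fact that $v_t$ is measurable with respect to $\mathcal{B}_t$, we have

$$p_{x_t}(x | \mathcal{B}_t) = \int_{\mathbb{R}^n} \Phi_n\left( x; \Phi^A(t, t_k)\, x^* + v_t, W(t, t_k) \right) p_{\tilde{x}_{t_k}}(x^* | \mathcal{B}_{t_k})\, dx^*. \tag{3.18}$$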


At  t  —  tk+l,  (3.17)  can  be  expressed  as 


Px 


°k+ 1 


<hn  (5;  Fkx *,  GkGk )  pitk  {x*\&tk)  dx*.  (3.19) 


We  solve  (3.8)  between  tk+1  and  tk+ 1,  in  order  to  get 


P**k+1  (x\^tk+1) 


Pxt]  (x\^tk+1)  7tfc+i  (n+ 1  -  Cifc+1a:) 
P*t-+i  (x*l'%+1)  7tfc+i  (r-fc+i  -  Ctk+1x*)  dx* 


Using  this  result,  the  equality  xt  =  xt+vt,  the  continuity  of  vt ,  and  rk+i—Ctk  vtk  = 
Pfc+i,  we  obtain 


Pxtk+1  (x\  &tk+1) 


(®l  ^t~+1)  Pvk+i  (Pk+i  ~  Lk+ix) 

(**l%+i)  P«*+1  (Pfc+i  -  ^fc+i^*) 


(3.20) 


The recursive formulas (3.19) and (3.20) specify a two-step procedure for determining $p_{\tilde{x}_{t_{k+1}}}(x | \mathcal{B}_{t_{k+1}})$ in terms of $p_{\tilde{x}_{t_k}}(x | \mathcal{B}_{t_k})$. We shall show that the same procedure can be used to determine $p_{\theta_{k+1}}(\theta | \mathcal{R}_{k+1})$ in terms of $p_{\theta_k}(\theta | \mathcal{R}_k)$, where $\mathcal{R}_k$ denotes the measurement history up to $k$.
Using the law of total probability, we can write


Pek+1  {e\&k)  =  /  pek+1  {e\ek  =  e*,^k)Pdk  {e*\&k)  dd* 

J  Rn 


(3.21) 


where  the  second  equality  is  obtained  from  (3.15a),  noting  that 


Pek+ 1  (0\0k  =  9*,&k)  =  Pek+1  ( 9\9k  =  6*) 


Also,  from  Bayes’  rule  we  obtain 


Pek+1  (0  |^fc+i)  =  Pek+1  {0\^k,Pk+i) 


Pdk+1  (d\&k)  Ppk+i  (Pfc+l  |^fc+ 1  9,&k) 

Pek+1  {9*\&k)pPk+1  (pk+l\9k+1  =  9*,&k)d9* 

Pek+1  {9\&k)Puk+1  (Pfc+i  -  Lk+19 ) 

Pek+1  {9*Wk)Pvk+1  (pk+ i  -  Lk+19*)  d9* 


(3.22) 


where the last equality is derived from (3.15b). Comparing the pair of formulas (3.21) and (3.22) with (3.19) and (3.20), we conclude that $p_{\tilde{x}_{t_k}}(\theta | \mathcal{B}_{t_k}) = p_{\theta_k}(\theta | \mathcal{R}_k)$, $k = 0, 1, 2, \ldots$, which leads to (3.16) upon substituting into (3.18). □

For the special case that $\nu_k$, $k = 1, 2, 3, \ldots$, is a Gaussian random vector, the estimation problem associated with the discrete-time model (3.15) has an exact Gaussian solution, which is consistent with the results of [39, 20]. For the general case, this estimation problem is difficult to solve, i.e., the problem is infinite-dimensional. While it seems that Theorem 3.4.1 converts a hard-to-solve problem into another hard-to-solve problem, the new formulation might be easier to approach, due to the discrete-time and linear nature of the model.

3.5  Optical  Beam  Tracking 

In Section 1.2.1, we briefly discussed the operation of optical beam tracking. This operation has been studied by Snyder [19] in a stochastic framework using an idealized model for the photodetector. In that work, the dynamics of the pointing assembly and the relative motion are modeled by (3.1), where $u_t$ is the (control) input of the pointing assembly. Also, the location of the center of the spot of light is modeled by $C_t x_t$ and its optical intensity is described by (3.2). In addition, [19] considers the Gaussian model (3.4) for the intensity pattern of the spot of light.¹ Regarding the position-sensitive photodetector, [19] makes two ideal assumptions: the photodetector has an infinite spatial resolution and an infinite area ($\mathcal{A} = \mathbb{R}^2$). The first assumption allows us to model the output of the photodetector by a space-time point process with rate² $\lambda_t(r, x_t)$ in (3.2) and the second one makes it possible to use the results of Theorems 3.3.2 and 3.3.3. Finally, the problem of optical beam tracking can be formulated in terms of minimizing (3.6) with $Q_t = C_t^T C_t$. For a detailed derivation of this model see Chapter 5.

¹ To evaluate the validity of this assumption see Chapter 5.

² Here, the background radiation and the dark current noise are ignored.

In a more realistic model, while keeping (3.1), (3.2), and (3.4), we describe the
output of the position-sensitive photodetector by a point process vector. For this
purpose, let $A^i$, $i = 1, 2, \ldots, q$, denote the $i$th partition on the surface of
the photodetector such that $\bigcup_{i=1}^{q} A^i = A$. The output of the region $A^i$
will be modeled by a doubly stochastic Poisson process $Y_t^i$ with rate
$\Lambda_t^i(x_t)$, where $\Lambda_t^i(\cdot) : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$
is defined as

$$
\Lambda_t^i(x) = \int_{A^i} \lambda_t(r, x)\, dr.
$$

To have a compact notation, we put $Y_t^i$, $i = 1, 2, \ldots, q$, in a vector $Y_t$ and
express the output of the position-sensitive photodetector by
$Y_t = (Y_t^1, Y_t^2, \ldots, Y_t^q)$.

In order to solve the optimal control problem associated with this new model, we need
to obtain an equation which describes the temporal evolution of the posterior density
(similar to (3.7)). The filtering problem associated with this equation is
infinite-dimensional, which requires some sort of approximation to reduce it into a
finite-dimensional problem. An alternative to this approach is to start from the
idealized model explained above and derive an appropriate approximation from the
“exact” results of Theorem 3.3.2 and Theorem 3.3.3. To justify this approach, we note
that the “infinite resolution” assumption provides a reasonable approximation for a
high spatial resolution photodetector. Also, the “infinite area” assumption is
appropriate when the photodetector area is significantly larger than the size and the
displacement of the spot of light.

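Following this approach, we approximate the optimal control associated with the
“finite resolution” model as

$$
u_t = -P_t^{-1} B_t^T K_t \tilde{x}_t
$$

where $\tilde{x}_t$ is the solution of the stochastic differential equations

$$
d\tilde{x}_t = A_t \tilde{x}_t\, dt + B_t u_t\, dt
  + \sum_{i=1}^{q} M_t \left( r_t^i - C_t \tilde{x}_t \right) dY_t^i
\tag{3.23a}
$$

$$
d\tilde{\Sigma}_t = A_t \tilde{\Sigma}_t\, dt + \tilde{\Sigma}_t A_t^T\, dt + D_t D_t^T\, dt
  - \sum_{i=1}^{q} M_t C_t \tilde{\Sigma}_t\, dY_t^i
\tag{3.23b}
$$

with the initial state $\tilde{x}_0 = \bar{x}_0$ and $\tilde{\Sigma}_0 = \bar{\Sigma}_0$.
Here, $r_t^i \in A^i$ is a representative point of the region $A^i$ and $M_t$ is
defined as

$$
M_t = \tilde{\Sigma}_t C_t^T \left( C_t \tilde{\Sigma}_t C_t^T + R_t \right)^{-1}.
$$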

We note that $Y_t^i$ is the integral over $A^i$ of the space-time point process, i.e.,
each point (event) of this process which occurs on $A^i$ increases the value of $Y_t^i$
by one unit. The information we lose by replacing the space-time point process with
$Y_t$ is the knowledge of the exact occurrence location of the points on $A^i$. In
fact, we derived (3.23a) from (3.9a) by replacing the exact occurrence location of the
points with $r_t^i$ as a representative point of $A^i$. This suggests that to achieve
the best performance of the estimator (3.23), $r_t^i$ must be chosen as a “good”
estimate of the occurrence locations, based on the past observation of $Y_t$.

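Based on the explanation above, an appropriate choice for $r_t^i$ is the minimum mean
squared error (MMSE) estimator

$$
r_t^i = \frac{\int_{A^i} r\, E\left[ \Phi_m(r; C_t x_t, R_t) \mid \mathcal{Y}_t \right] dr}
             {\int_{A^i} E\left[ \Phi_m(r; C_t x_t, R_t) \mid \mathcal{Y}_t \right] dr}
$$

where $\mathcal{Y}_t$ is the $\sigma$-algebra generated by $Y_t$ over $[0, t)$.
Approximating $p_{x_t}(x \mid \mathcal{Y}_t)$ with $\Phi_n(x; \tilde{x}_t, \tilde{\Sigma}_t)$,
the expression above can be written as¹

$$
r_t^i = \frac{\int_{A^i} r\, \Phi_m\left( r; C_t \tilde{x}_t, R_t + C_t \tilde{\Sigma}_t C_t^T \right) dr}
             {\int_{A^i} \Phi_m\left( r; C_t \tilde{x}_t, R_t + C_t \tilde{\Sigma}_t C_t^T \right) dr}.
$$

Another suitable choice for $r_t^i$ is the maximum a posteriori (MAP) estimator

$$
r_t^i = \arg\max_{r \in A^i} \Phi_m\left( r; C_t \tilde{x}_t, R_t + C_t \tilde{\Sigma}_t C_t^T \right).
$$

When the partition $A^1, A^2, \ldots, A^q$ is fine enough, for the sake of simplicity,
$r_t^i$ can be a predetermined point of $A^i$ such as

$$
r_t^i = \frac{\int_{A^i} r\, \Phi_m(r; 0, R_t)\, dr}{\int_{A^i} \Phi_m(r; 0, R_t)\, dr}.
$$

¹ See Lemma 4.5.1.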
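The MMSE and MAP choices of the representative point can be sketched numerically. The following is only an illustration under assumptions that are not in the text: the cell $A^i$ is an axis-aligned rectangle in the plane, the integrals are approximated by a midpoint rule on a grid, and the function names `gaussian_pdf`, `mmse_point`, and `map_point` are hypothetical. The arguments `center` and `cov` stand for $C_t \tilde{x}_t$ and $R_t + C_t \tilde{\Sigma}_t C_t^T$.

```python
import numpy as np

def gaussian_pdf(points, mean, cov):
    """Density of N(mean, cov) evaluated at each row of `points`."""
    d = points - mean
    inv = np.linalg.inv(cov)
    quad = np.einsum("ij,jk,ik->i", d, inv, d)
    norm = np.sqrt((2.0 * np.pi) ** len(mean) * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def _cell_grid(cell, n):
    """Midpoints of an n x n grid over the rectangle (xmin, xmax, ymin, ymax)."""
    xmin, xmax, ymin, ymax = cell
    xs = xmin + (np.arange(n) + 0.5) * (xmax - xmin) / n
    ys = ymin + (np.arange(n) + 0.5) * (ymax - ymin) / n
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    return np.column_stack([gx.ravel(), gy.ravel()])

def mmse_point(cell, center, cov, n=200):
    """Centroid of the Gaussian density restricted to the cell (MMSE choice)."""
    pts = _cell_grid(cell, n)
    w = gaussian_pdf(pts, center, cov)
    return (pts * w[:, None]).sum(axis=0) / w.sum()

def map_point(cell, center, cov, n=200):
    """Grid point of the cell with the largest Gaussian density (MAP choice)."""
    pts = _cell_grid(cell, n)
    return pts[np.argmax(gaussian_pdf(pts, center, cov))]
```

For a cell that does not contain the estimated beam center, the MMSE point is pulled toward the side of the cell nearest the center, while the MAP point sits on that side (up to the grid resolution).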


Chapter  4 


Active  Pointing 

4.1  Introduction 

In this chapter we study the estimation and control problems defined in Section 3.2.2
for the case of $A \neq \mathbb{R}^m$. Since under this assumption the associated
filtering problem is infinite-dimensional, we focus our attention on an approximate
solution for the problem, which leads to a suboptimal estimator and controller.

The motivation for this study is its application in an active fine pointing scheme
for short range free-space optical communication. The one-way optical link under
consideration comprises an optical transmitter and an optical receiver which are
subject to relative motion. The optical transmitter is equipped with an
electromechanical pointing assembly which can control the azimuth and elevation of a
transmitting laser source. The optical beam emitted by the laser source has a
nonuniform intensity profile which is assumed to be Gaussian [41]. Normally, the
aperture of the receiver is smaller than the received optical beam, so that the
receiver can collect only a fraction of the optical power. In order to enlarge this
captured fraction, the goal of active pointing is to hold the center of the optical
beam at the center of the receiving aperture. The receiver employs a position-sensitive
photodetector to measure the intensity profile of the optical beam that strikes its
aperture. The output of the photodetector is used to estimate the center of the
received optical beam, which is then conveyed to the transmitter through an optical
link or a low-bandwidth RF channel. The pointing assembly then adjusts the orientation
of the transmitter based on this estimate. The concept of this active pointing method
is illustrated in the block diagram of Figure 1.2.

The  performance  of  the  proposed  active  pointing  scheme  depends  significantly 
on  the  accuracy  of  the  estimate  of  the  beam  center.  In  order  to  achieve  a  good 
estimate  of  the  beam  center,  it  is  necessary  that  the  size  of  the  receiving  aperture 
be  comparable  with  the  size  of  the  beam.  This  requirement  limits  the  application 
of  the  method  to  short  distance  links. 

The remainder of this chapter is organized as follows. In the next section, we
show that the overall active pointing scheme can be described in terms of the model
of Section 3.2.1 and its associated estimation and control problems in Section 3.2.2.
Sections 4.3 and 4.4 consider the estimation and control problems, respectively.
Since the proofs of the theorems stated in these sections are long, for the sake of
continuity of discussion, we present the proofs in Section 4.5.

4.2  The  Model 

Let the two-dimensional vector $\theta_t$ denote the azimuth and elevation angles of
the transmitter axis with respect to some fixed coordinate system. Similarly,
$\alpha_t$ denotes the azimuth and elevation angles of the line-of-sight of the
stations (passing through the center of the receiving aperture) with respect to the
same coordinate system. We assume that the receiving aperture is held perpendicular to
the line-of-sight by means of an optical beam tracking system. Then, for a small
pointing error $\theta_t - \alpha_t$, the displacement of the center of the optical
beam with respect to the center of the receiving aperture is given by
$y_t = l(\theta_t - \alpha_t)$, where the known constant $l$ is the distance between
the stations. Figure 4.1 illustrates the optical beam in the plane of the receiving
aperture and the displacement vector $y_t$.


Figure  4.1:  Receiving  aperture,  optical  beam,  and  the  displacement  vector  yt. 


The pointing assembly is an electromechanical system with the input vector
$u_t \in \mathbb{R}^2$ and the output vector $\theta_t \in \mathbb{R}^2$. We describe
this system by the linear stochastic state-space equations

$$
\begin{aligned}
dx_t^p &= A_t^p x_t^p\, dt + B_t^p u_t\, dt + D_t^p\, dw_t^p \\
\theta_t &= C_t^p x_t^p
\end{aligned}
\tag{4.1}
$$

where $x_t^p \in \mathbb{R}^{n_p}$ is the state vector and $\{w_t^p\}$ is an
$m_p$-dimensional standard Wiener process. In this equation, $A_t^p$, $B_t^p$, $D_t^p$,
and $C_t^p$ are known uniformly bounded matrices of appropriate dimensions. Using a
linear model for the pointing assembly is justified by the fact that the system
operates over small angles during the active pointing regime¹.

We model $\alpha_t$ by a Gauss-Markov stochastic process described by the state-space
equations

$$
\begin{aligned}
dx_t^d &= A_t^d x_t^d\, dt + D_t^d\, dw_t^d \\
\alpha_t &= C_t^d x_t^d
\end{aligned}
\tag{4.2}
$$

with the state vector $x_t^d \in \mathbb{R}^{n_d}$, $m_d$-dimensional standard Wiener
process $\{w_t^d\}$, and known uniformly bounded matrices $A_t^d$, $D_t^d$, and $C_t^d$
of proper dimensions.

The displacement vector $y_t = l(\theta_t - \alpha_t)$ is a linear function of $x_t^p$
and $x_t^d$, so that (4.1) and (4.2) can be combined in a compact form

$$
\begin{aligned}
dx_t &= A_t x_t\, dt + B_t u_t\, dt + D_t\, dw_t \\
y_t &= C_t x_t
\end{aligned}
\tag{4.3}
$$

with the state vector $x_t \in \mathbb{R}^n$ and $m$-dimensional standard Wiener
process $\{w_t\}$, where $n = n_p + n_d$ and $m = m_p + m_d$. The initial state $x_0$ is
assumed to be Gaussian with mean $\bar{x}_0$ and covariance matrix $\bar{\Sigma}_0$, and
independent of $\{w_t\}$.
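As an illustration of how (4.1) and (4.2) stack into (4.3), the sketch below assembles the combined matrices at a fixed time. It assumes the state ordering $x_t = (x_t^p, x_t^d)$ and $w_t = (w_t^p, w_t^d)$, which is one natural choice consistent with $n = n_p + n_d$ and $m = m_p + m_d$ but is not stated explicitly in the text; the function name `combine_models` is hypothetical.

```python
import numpy as np

def combine_models(Ap, Bp, Dp, Cp, Ad, Dd, Cd, l):
    """Stack the pointing-assembly model (4.1) and the line-of-sight
    model (4.2) into the compact form (4.3), assuming x = [x^p; x^d]."""
    n_p, n_d = Ap.shape[0], Ad.shape[0]
    m_p, m_d = Dp.shape[1], Dd.shape[1]
    A = np.block([[Ap, np.zeros((n_p, n_d))],
                  [np.zeros((n_d, n_p)), Ad]])
    # The control input drives only the pointing assembly.
    B = np.vstack([Bp, np.zeros((n_d, Bp.shape[1]))])
    D = np.block([[Dp, np.zeros((n_p, m_d))],
                  [np.zeros((n_d, m_p)), Dd]])
    # y = l * (theta - alpha) = l * (C^p x^p - C^d x^d)
    C = l * np.hstack([Cp, -Cd])
    return A, B, D, C
```

With this ordering, $A_t$ and $D_t$ are block diagonal, and the relative motion enters the output only through the $-l\,C_t^d$ block of $C_t$.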

Let $r$ denote the position vector of an arbitrary point on the plane of the
receiving aperture with respect to a coordinate system centered at the center of the
aperture. Then, for a Gaussian beam centered at $y_t = C_t x_t$, the optical intensity
over the plane of the aperture is proportional to
$\Phi_2(r - y_t; 0, R_t) = \Phi_2(r; C_t x_t, R_t)$, where $R_t = R_t^T$ is a
$2 \times 2$ positive definite matrix describing the shape of the beam. For a
circularly symmetric beam with a constant radius $\varrho > 0$, $R_t$ can be expressed
as $R_t = \varrho^2 I_{2 \times 2}$.

¹ For a detailed discussion of this issue see Chapter 5.

Let $A$ denote the set of points on the receiving aperture. In a practical system,
the optical field over the receiving aperture is focused on a position-sensitive
photodetector of small surface area by means of a focusing lens. The photodetector
measures the intensity profile of the imaged optical field, which is a scaled-down
version of the optical intensity over the receiving aperture. Therefore, we consider
the combination of the lens and the photodetector as a virtual photodetector of
area $A$, i.e., we assume that the virtual photodetector provides the observation of
the optical intensity over $A$.

Following [19] and our discussion in Section 3.5, we use an “infinite resolution”
model for a high resolution photodetector employed by the receiver. According to
this model, we describe the output of the photodetector by a space-time point
process defined over $A$ with a rate given by

$$
\lambda_t(r, x_t) = \mu_t \Phi_2(r; C_t x_t, R_t),
$$

where $\mu_t$ is a known nonnegative function. Recall from Section 3.2.2 that
$\mathcal{B}_t$ denotes the $\sigma$-algebra generated by the space-time point process
over $[0, t)$.
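To make the observation model concrete, the following sketch draws the events of such a space-time point process over a short interval during which the beam center is treated as fixed. It rests on several assumptions that are not in the text: $\mu_t$ is held constant at `mu`, the aperture $A$ is a square of half-width `half_width` centered at the origin, and the function name `sample_events` is hypothetical. The construction uses a standard property of Poisson processes: sample the process on the whole plane (a Poisson number of Gaussian-distributed points) and keep only the points that land in $A$.

```python
import numpy as np

def sample_events(rng, mu, center, R, half_width, dt):
    """Events of a Poisson field with rate mu * Phi_2(r; center, R),
    observed on the square |r_1|, |r_2| <= half_width during [0, dt)."""
    # On the whole plane the spatial integral of the rate is mu,
    # so the total count over [0, dt) is Poisson(mu * dt).
    count = rng.poisson(mu * dt)
    pts = rng.multivariate_normal(center, R, size=count)
    times = rng.uniform(0.0, dt, size=count)
    # Restrict to the aperture: the retained points form a Poisson
    # process on A with the same rate function.
    inside = np.all(np.abs(pts) <= half_width, axis=1)
    order = np.argsort(times[inside])
    return times[inside][order], pts[inside][order]
```

The fraction of discarded points grows as the aperture shrinks relative to the beam, which is the regime $A \neq \mathbb{R}^2$ studied in this chapter.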

The central objective of an active pointing system is to maintain the centroid of
the optical beam as close as possible to the center of the photodetector. This control
task can be interpreted as one of minimizing $y_t$ with respect to some appropriate
norm. For this purpose, we adopt the quadratic cost functional

$$
J = E\left[ \int_0^T \left( x_t^T Q_t x_t + u_t^T P_t u_t \right) dt + x_T^T S x_T \right]
\tag{4.4}
$$

with $Q_t = C_t^T C_t$, $P_t = p I_{2 \times 2}$, and $S = 0$, where $p > 0$ is a known
constant.

Our discussion up to this point indicates that the controller design for an active
pointing system can be formulated in terms of the control problem of Section 3.2.2
subject to the model of Section 3.2.1. An intermediate step for solving the control
problem is to obtain the posterior density $p_{x_t}(x \mid \mathcal{B}_t)$. In the next
section, we discuss this problem and develop an approximation for the posterior
density. In Section 4.4, we employ this approximation in order to determine an
approximate solution for the optimal control problem. Although for the specific
application of active pointing the space-time point process is defined over
$A \subset \mathbb{R}^2$, our results can be applied to the general case of
$A \subset \mathbb{R}^m$. Hence, for the sake of generality, we present these results
for an arbitrary integer $m$.


4.3  Estimation  Problem 

Recall from Chapter 3 that the posterior density $p_{x_t}(x \mid \mathcal{B}_t)$ is the
solution of the stochastic partial differential-integral equation (3.7). For the case
that $A \neq \mathbb{R}^m$, the filtering problem associated with this equation is
infinite-dimensional; however, when $A$ is large compared with the size of the optical
beam, a finite-dimensional approximation is reasonable. The fact that for
$A = \mathbb{R}^m$ the posterior density $p_{x_t}(x \mid \mathcal{B}_t)$ is Gaussian¹
motivates us to consider a Gaussian approximation for $p_{x_t}(x \mid \mathcal{B}_t)$
when $A \neq \mathbb{R}^m$. In the remainder of this section, we develop a method to
determine the mean and covariance matrix of such a Gaussian approximation. The
cumulant generating function associated with $p_{x_t}(x \mid \mathcal{B}_t)$ plays a
central role in this development.

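The conditional cumulant generating function of $x_t$ given $\mathcal{B}_t$ is defined as

$$
\psi_t(s) = \ln E\left[ \exp\left( j\omega^T x_t \right) \mid \mathcal{B}_t \right] \Big|_{j\omega = s}.
$$

This function can be expanded in terms of the conditional cumulants
$\kappa_t^{i_1 i_2 \cdots i_j}$ [42] as

$$
\psi_t(s) = \sum_{j=1}^{\infty} \frac{1}{j!} \sum_{i_1, \ldots, i_j \in \mathcal{N}}
  \kappa_t^{i_1 i_2 \cdots i_j}\, s_{i_1} s_{i_2} \cdots s_{i_j}
\tag{4.5}
$$

where $\mathcal{N} = \{1, 2, \ldots, n\}$ and $s = (s_1, s_2, \ldots, s_n)$. Note that
$\hat{x}_t$ and $\Sigma_t$ are represented in terms of the first and the second order
cumulants as $\hat{x}_t = (\kappa_t^1, \kappa_t^2, \ldots, \kappa_t^n)$ and
$\Sigma_t = [\kappa_t^{ij}]$. The temporal evolution of $\psi_t(\cdot)$ is described by
a partial differential-integral equation derived from (3.7) and is stated next.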

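¹ It is also assumed that the intensity pattern $\gamma_t(\cdot)$ is a Gaussian map.

Theorem 4.3.1. Let $\psi_t(\cdot)$ be the conditional cumulant generating function of
$x_t$ given $\mathcal{B}_t$, where $x_t$ is the solution of (4.3). Then, the temporal
evolution of $\psi_t(\cdot)$ is described by

$$
\begin{aligned}
d\psi_t(s) ={}& s^T \left( A_t \frac{\partial \psi_t(s)}{\partial s} + B_t u_t \right) dt
  + \frac{1}{2} s^T D_t D_t^T s\, dt \\
&+ \int_A \left( \ln \beta_t(r, s) - \ln \beta_t(r, 0) \right) N(dt \times dr)
  - \int_A \left( \beta_t(r, s) - \beta_t(r, 0) \right) dr\, dt
\end{aligned}
\tag{4.6}
$$

where $\beta_t(\cdot, \cdot)$ is defined as

$$
\beta_t(r, s) = \exp\{ -\psi_t(j\omega) \}\,
  E\left[ \exp\left( j\omega^T x_t \right) \lambda_t(r, x_t) \mid \mathcal{B}_t \right] \Big|_{j\omega = s}.
\tag{4.7}
$$

Moreover, if the Fourier transform of $\lambda_t(r, \cdot)$,

$$
\Lambda_t(r, j\omega) = \int_{\mathbb{R}^n} \lambda_t(r, x) \exp\left( -j\omega^T x \right) dx,
$$

exists, $\beta_t(\cdot, \cdot)$ can be expressed as

$$
\beta_t(r, s) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \Lambda_t(r, j\nu)
  \exp\{ \psi_t(j\nu + s) - \psi_t(s) \}\, d\nu.
\tag{4.8}
$$

Proof. See Section 4.5.1. □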

The temporal evolution of the cumulants is described by a (generally infinite) set of
nonlinear stochastic differential equations driven by the space-time point process
$N(\mathcal{T} \times \mathcal{S})$. This set of equations can be derived from (4.6) by
matching the coefficients of corresponding $s_{i_1} s_{i_2} \cdots s_{i_j}$ on the two
sides of (4.6). We usually suppose that the first few cumulants approximate
$p_{x_t}(x \mid \mathcal{B}_t)$ within an acceptable precision. This means that the
infinite set of equations can be approximated by a finite-dimensional one.

Regarding this approach, two issues should be addressed. First, we need to compute
$\beta_t(\cdot, \cdot)$ in terms of the cumulants via equations (4.7) or (4.8) and
expansion (4.5), which is not straightforward for an arbitrary number of cumulants.
Second, when we truncate (4.5) to a finite number of terms, the corresponding
approximation for $p_{x_t}(x \mid \mathcal{B}_t)$ might not be a valid probability
density function, i.e., it might be negative for some $x$. When we limit the expansion
(4.5) to the first and second order terms (Gaussian approximation), these difficulties
are avoided. In this case, $\beta_t(\cdot, \cdot)$ can be easily calculated (when the
intensity pattern $\gamma_t(\cdot)$ is Gaussian) and the truncated expansion leads to a
valid probability density.

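We have used the method above to approximate $p_{x_t}(x \mid \mathcal{B}_t)$ with a
Gaussian probability density. It is shown in Section 4.5.2 that the mean $\tilde{x}_t$
and the covariance matrix $\tilde{\Sigma}_t$ of this Gaussian approximation are the
solutions of the stochastic differential equations

$$
\begin{aligned}
d\tilde{x}_t &= A_t \tilde{x}_t\, dt + B_t u_t\, dt
  + \int_A M_t \left( r - C_t \tilde{x}_t \right) N(dt \times dr)
  - \mu_t h_t(\tilde{x}_t, \tilde{\Sigma}_t)\, dt \\
d\tilde{\Sigma}_t &= A_t \tilde{\Sigma}_t\, dt + \tilde{\Sigma}_t A_t^T\, dt + D_t D_t^T\, dt
  - M_t C_t \tilde{\Sigma}_t\, dN_t + \mu_t H_t(\tilde{x}_t, \tilde{\Sigma}_t)\, dt
\end{aligned}
\tag{4.9}
$$

with the initial state $\tilde{x}_0 = \bar{x}_0$ and $\tilde{\Sigma}_0 = \bar{\Sigma}_0$.
Here, we have $N_t = N([0, t) \times A)$ and $M_t = \Gamma_t(\tilde{\Sigma}_t)$, where
$\Gamma_t(\cdot) : \mathbb{R} \times \mathbb{R}^{n \times n} \to \mathbb{R}^{n \times m}$
is defined as

$$
\Gamma_t(\Sigma) = \Sigma C_t^T \left( C_t \Sigma C_t^T + R_t \right)^{-1}.
$$

Also, $h_t(\cdot, \cdot) : \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^{n \times n} \to \mathbb{R}^n$
and $H_t(\cdot, \cdot) : \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^{n \times n} \to \mathbb{R}^{n \times n}$
are given by

$$
h_t(x, \Sigma) = \int_A \Gamma_t(\Sigma) \left( r - C_t x \right)
  \Phi_m\left( r; C_t x, C_t \Sigma C_t^T + R_t \right) dr
$$

$$
H_t(x, \Sigma) = \int_A \Gamma_t(\Sigma)
  \left( C_t \Sigma C_t^T + R_t - (r - C_t x)(r - C_t x)^T \right) \Gamma_t^T(\Sigma)\,
  \Phi_m\left( r; C_t x, C_t \Sigma C_t^T + R_t \right) dr.
$$

Note that $\tilde{x}_t$ and $\tilde{\Sigma}_t$ are approximations of the conditional
mean $\hat{x}_t$ and the conditional covariance $\Sigma_t$, not their exact values.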
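The jump part of (4.9) can be read off directly: when an event is observed at location $r$, the mean moves by $M_t(r - C_t \tilde{x}_t)$ and the covariance drops by $M_t C_t \tilde{\Sigma}_t$. A minimal sketch of this update follows; the function names `gain` and `jump_update` are hypothetical, and the drift terms between events (including $h_t$ and $H_t$) are deliberately left out.

```python
import numpy as np

def gain(Sigma, C, R):
    """Gamma_t(Sigma) = Sigma C^T (C Sigma C^T + R)^{-1}."""
    W = C @ Sigma @ C.T + R
    return Sigma @ C.T @ np.linalg.inv(W)

def jump_update(x, Sigma, r, C, R):
    """State of the approximate estimator just after an event at location r.

    Implements only the jump terms of (4.9); the continuous drift
    between events is not included in this sketch.
    """
    M = gain(Sigma, C, R)
    x_new = x + M @ (r - C @ x)
    Sigma_new = Sigma - M @ C @ Sigma
    return x_new, Sigma_new
```

The update has the same algebraic form as a Kalman measurement update in which the event location plays the role of a noisy measurement of $C_t x_t$ with noise covariance $R_t$.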

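Remark 4.3.1. The definitions of $h_t(\cdot, \cdot)$ and $H_t(\cdot, \cdot)$ imply that
as $A \to \mathbb{R}^m$, $h_t(\cdot, \cdot) \to 0$ and $H_t(\cdot, \cdot) \to 0$, because
over all of $\mathbb{R}^m$ the Gaussian density
$\Phi_m(r; C_t x, C_t \Sigma C_t^T + R_t)$ has zero first central moment and second
central moment equal to $C_t \Sigma C_t^T + R_t$. As a consequence, the approximate
estimator (4.9) tends to the exact estimator (3.9). In this sense, we can say that
(4.9) is an asymptotically optimal estimator.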
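The limit in Remark 4.3.1 can be checked numerically in the simplest setting. The sketch below assumes a scalar problem ($n = m = 1$, $C_t = 1$) with the aperture $A = [-a, a]$ and evaluates $h_t$ and $H_t$ by a trapezoidal rule; the function name `h_and_H` and the chosen numbers are illustrative only.

```python
import numpy as np

def h_and_H(x, sigma, R, a, n=20001):
    """Scalar versions of h_t and H_t (n = m = 1, C_t = 1) for A = [-a, a]."""
    W = sigma + R                 # C Sigma C^T + R
    gamma = sigma / W             # Gamma_t(Sigma)
    r = np.linspace(-a, a, n)
    dr = r[1] - r[0]
    phi = np.exp(-0.5 * (r - x) ** 2 / W) / np.sqrt(2.0 * np.pi * W)
    # Trapezoidal weights: half weight at the two end points.
    wts = np.full(n, dr)
    wts[0] = wts[-1] = 0.5 * dr
    h = gamma * np.sum(wts * (r - x) * phi)
    H = gamma ** 2 * np.sum(wts * (W - (r - x) ** 2) * phi)
    return h, H

# A small aperture gives visibly nonzero corrections; a wide one does not.
h_small, H_small = h_and_H(0.3, 0.5, 1.0, a=1.0)
h_wide, H_wide = h_and_H(0.3, 0.5, 1.0, a=20.0)
```

As the half-width `a` grows well beyond the beam size, both corrections shrink toward zero and the estimator reduces to the jump-driven equations of the infinite-area case.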

4.4  Control  Problem 

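We exploit the results of the previous section in solving the control problem as
stated in Theorem 4.4.1 below. Before presenting this result, we fix notation. Let
$\Sigma = [\sigma_{ij}]$ denote a symmetric $n \times n$ matrix and $f(\Sigma)$ be a
scalar function of $\Sigma$. Assume that the partial derivatives of $f(\Sigma)$ with
respect to the elements of $\Sigma$ exist. We denote by $\partial f(\Sigma)/\partial \Sigma$
an $n \times n$ symmetric matrix $F(\Sigma) = [F_{ij}(\Sigma)]$ such that
$F_{ii} = \partial f/\partial \sigma_{ii}$ and
$F_{ij} = (1/2)\, \partial f/\partial \sigma_{ij}$ for $i \neq j$. Let $g_t(x, \Sigma)$
be a scalar function of $x \in \mathbb{R}^n$ and the $n \times n$ symmetric matrix
$\Sigma$. Assume that the partial derivatives of $g_t(x, \Sigma)$ with respect to $x$
and $\Sigma$ exist. Define the linear operator $\mathcal{L}_t\{\cdot\}$ as

$$
\begin{aligned}
\mathcal{L}_t\{ g_t(x, \Sigma) \} ={}& \int_A \Big( g_t\big( x + \Gamma_t(\Sigma)(r - C_t x),\,
  \Sigma - \Gamma_t(\Sigma) C_t \Sigma \big) - g_t(x, \Sigma) \Big)\,
  \Phi_m\left( r; C_t x, C_t \Sigma C_t^T + R_t \right) dr \\
&- \left( \frac{\partial g_t(x, \Sigma)}{\partial x} \right)^T h_t(x, \Sigma)
  + \mathrm{tr}\left\{ \frac{\partial g_t(x, \Sigma)}{\partial \Sigma} H_t(x, \Sigma) \right\}.
\end{aligned}
\tag{4.10}
$$

Finally, we use $\| \cdot \|_{P_t}^2$ to denote $(\cdot)^T P_t (\cdot)$.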

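Theorem 4.4.1. Let $x \in \mathbb{R}^n$ and $\Sigma$ be an $n \times n$ symmetric matrix.
Suppose that $g_t(x, \Sigma)$, $t \in [0, T]$, is the solution of the partial
differential equation

$$
\begin{aligned}
-\frac{\partial}{\partial t} g_t(x, \Sigma) ={}&
  \left( \frac{\partial}{\partial x} g_t(x, \Sigma) \right)^T A_t x
  - \frac{1}{4} \left( \frac{\partial}{\partial x} g_t(x, \Sigma) \right)^T
    B_t P_t^{-1} B_t^T \left( \frac{\partial}{\partial x} g_t(x, \Sigma) \right) \\
&+ \mathrm{tr}\left\{ \left( \frac{\partial}{\partial \Sigma} g_t(x, \Sigma) \right)
    \left( A_t \Sigma + \Sigma A_t^T + D_t D_t^T \right) + Q_t \Sigma \right\} \\
&+ x^T Q_t x + \mu_t \mathcal{L}_t\{ g_t(x, \Sigma) \}
\end{aligned}
\tag{4.11}
$$

with the boundary condition $g_T(x, \Sigma) = x^T S x$. Then, the cost functional (4.4)
can be expressed as

$$
J = g_0(\bar{x}_0, \bar{\Sigma}_0) + E\left[ \int_0^T \delta_t\, dt \right]
  + E\left[ \int_0^T \left\| u_t + \frac{1}{2} P_t^{-1} B_t^T
    \frac{\partial}{\partial x} g_t(\tilde{x}_t, \tilde{\Sigma}_t) \right\|_{P_t}^2 dt \right]
\tag{4.12}
$$

where

$$
\begin{aligned}
\delta_t = \int_{\mathbb{R}^n} \bigg( & x^T Q_t x
  + \int_A \Big( g_t\big( \tilde{x}_t + M_t (r - C_t \tilde{x}_t),\,
    \tilde{\Sigma}_t - M_t C_t \tilde{\Sigma}_t \big)
    - g_t(\tilde{x}_t, \tilde{\Sigma}_t) \Big) \lambda_t(r, x)\, dr \bigg) \\
& \cdot \left( p_{x_t}(x \mid \mathcal{B}_t) - \tilde{p}_{x_t}(x \mid \mathcal{B}_t) \right) dx
\end{aligned}
\tag{4.13}
$$

is the error term resulting from replacing the posterior density
$p_{x_t}(x \mid \mathcal{B}_t)$ by its Gaussian approximation
$\tilde{p}_{x_t}(x \mid \mathcal{B}_t)$.

Proof. See Section 4.5.3. □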

The first term on the right side of (4.12) does not depend on $u_t$ and so is not
involved in the minimization. While the hard-to-compute error term $\delta_t$ in (4.12)
depends on $u_t$, it is supposed to be small. Therefore, in minimizing (4.12), we ignore
the term involving $\delta_t$ and only minimize the third term. We note that the minimum
of the third term is zero, which is achieved when $u_t$ is given by

$$
u_t^* = -\frac{1}{2} P_t^{-1} B_t^T \frac{\partial}{\partial x} g_t(\tilde{x}_t, \tilde{\Sigma}_t).
\tag{4.14}
$$

Then the cost associated with $u_t^*$ will be

$$
J^* = g_0(\bar{x}_0, \bar{\Sigma}_0) + \int_0^T E\left[ \delta_t \big|_{u_t = u_t^*} \right] dt.
$$

When $A = \mathbb{R}^m$, a simple solution can be obtained for (4.11). This solution,
which is stated by the following theorem, confirms that the control (4.14) is consistent
with that obtained for $A = \mathbb{R}^m$ by Rhodes and Snyder [20] in Theorem 3.3.3.
This indicates that the suboptimal control (4.14) tends to the optimal control as
$A \to \mathbb{R}^m$.

Theorem 4.4.2. When $A = \mathbb{R}^m$, the solution of the partial differential
equation (4.11) with the boundary condition $g_T(x, \Sigma) = x^T S x$ can be expressed as

$$
g_t(x, \Sigma) = x^T K_t x + f_t(\Sigma)
\tag{4.15}
$$

where $K_t$ is the backward solution to the Riccati equation

$$
\dot{K}_t = -K_t A_t - A_t^T K_t + K_t B_t P_t^{-1} B_t^T K_t - Q_t
\tag{4.16}
$$

with $K_T = S$, and $f_t(\Sigma)$ is the solution of the partial differential equation

$$
\begin{aligned}
-\frac{\partial}{\partial t} f_t(\Sigma) ={}&
  \mathrm{tr}\left\{ \left( \frac{\partial}{\partial \Sigma} f_t(\Sigma) \right)
    \left( A_t \Sigma + \Sigma A_t^T + D_t D_t^T \right) + Q_t \Sigma \right\} \\
&+ \mu_t \left( f_t\left( \Sigma - \Gamma_t(\Sigma) C_t \Sigma \right) - f_t(\Sigma) \right)
  + \mu_t\, \mathrm{tr}\left\{ \Gamma_t(\Sigma) C_t \Sigma K_t \right\}
\end{aligned}
\tag{4.17}
$$

with the boundary condition $f_T(\Sigma) = 0$.

Proof. See Section 4.5.4. □

We observe from (4.14) and (4.15) that when $A = \mathbb{R}^m$, the optimal control is
given by

$$
u_t^* = -P_t^{-1} B_t^T K_t \hat{x}_t
\tag{4.18}
$$

with the optimal cost

$$
J^* = \bar{x}_0^T K_0 \bar{x}_0 + f_0(\bar{\Sigma}_0).
\tag{4.19}
$$

While the optimal control (4.18) has been obtained by Rhodes and Snyder [20], the
value of the corresponding optimal cost (4.19) is newly obtained here.
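The gain $K_t$ in (4.18) comes from integrating the Riccati equation (4.16) backward from $K_T = S$. The sketch below does this with a classical fourth-order Runge-Kutta scheme, under the simplifying assumption that $A_t$, $B_t$, $Q_t$, and $P_t$ are constant in time (the text allows them to vary); the function names `riccati_backward` and `control` are hypothetical.

```python
import numpy as np

def riccati_backward(A, B, Q, P, S, T, steps=2000):
    """Integrate (4.16) backward from K_T = S with RK4 (constant matrices).

    Returns a list Ks with Ks[i] approximating K at t = i * T / steps.
    """
    def kdot(K):
        return -K @ A - A.T @ K + K @ B @ np.linalg.solve(P, B.T) @ K - Q

    h = T / steps
    K = np.array(S, dtype=float)
    out = [K]
    for _ in range(steps):
        # One RK4 step of size -h (moving from t to t - h).
        k1 = kdot(K)
        k2 = kdot(K - 0.5 * h * k1)
        k3 = kdot(K - 0.5 * h * k2)
        k4 = kdot(K - h * k3)
        K = K - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        out.append(K)
    return out[::-1]

def control(K, B, P, x_hat):
    """Feedback law of the form (4.18): u = -P^{-1} B^T K x_hat."""
    return -np.linalg.solve(P, B.T @ K @ x_hat)
```

In the scalar case $A = 0$, $B = Q = P = 1$, $S = 0$, equation (4.16) reduces to $\dot{K}_t = K_t^2 - 1$, whose backward solution is $K_t = \tanh(T - t)$; this gives a convenient sanity check for the integrator.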


4.5  Proof  of  the  Theorems 

4.5.1  Proof  of  Theorem  4.3.1 

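The Fourier transform of (3.7) is given by [20] as

$$
\begin{aligned}
d\phi_t(j\omega) ={}& E\left[ \exp\left( j\omega^T x_t \right)
  \left( j\omega^T (A_t x_t + B_t u_t) - \frac{1}{2} \omega^T D_t D_t^T \omega \right)
  \Big| \mathcal{B}_t \right] dt \\
&+ \int_A E\left[ \exp\left( j\omega^T x_t \right)
  \left( \lambda_t(r, x_t)\, \hat{\lambda}_t^{-1}(r) - 1 \right) \Big| \mathcal{B}_t \right] N(dt \times dr) \\
&- \int_A E\left[ \exp\left( j\omega^T x_t \right)
  \left( \lambda_t(r, x_t) - \hat{\lambda}_t(r) \right) \Big| \mathcal{B}_t \right] dr\, dt
\end{aligned}
\tag{4.20}
$$

with $\phi_t(j\omega) = \exp\{ \psi_t(j\omega) \}$. Let $t_0 = 0$ and
$t_1 < t_2 < t_3 < \cdots$ be the occurrence times of the space-time process
$N(\mathcal{T} \times \mathcal{S})$. During the interval $(t_k, t_{k+1})$,
$k = 0, 1, 2, \ldots$, the first integral on the right side of (4.20) is zero, thus we
can write

$$
\begin{aligned}
d \exp\{ \psi_t(j\omega) \} ={}& E\left[ \exp\left( j\omega^T x_t \right)
  \left( j\omega^T (A_t x_t + B_t u_t) - \frac{1}{2} \omega^T D_t D_t^T \omega \right)
  \Big| \mathcal{B}_t \right] dt \\
&- \int_A E\left[ \exp\left( j\omega^T x_t \right)
  \left( \lambda_t(r, x_t) - \hat{\lambda}_t(r) \right) \Big| \mathcal{B}_t \right] dr\, dt.
\end{aligned}
$$

Using the identity

$$
E\left[ x_t \exp\left( j\omega^T x_t \right) \mid \mathcal{B}_t \right]
= \frac{\partial}{\partial (j\omega)} E\left[ \exp\left( j\omega^T x_t \right) \mid \mathcal{B}_t \right]
= \exp\{ \psi_t(j\omega) \} \frac{\partial \psi_t(j\omega)}{\partial (j\omega)}
$$

we rewrite this equation as

$$
\begin{aligned}
\exp\{ \psi_t(j\omega) \}\, d\psi_t(j\omega) ={}& \exp\{ \psi_t(j\omega) \}
  \left( j\omega^T \left( A_t \frac{\partial \psi_t(j\omega)}{\partial (j\omega)} + B_t u_t \right)
  - \frac{1}{2} \omega^T D_t D_t^T \omega \right) dt \\
&- \int_A E\left[ \exp\left( j\omega^T x_t \right)
  \left( \lambda_t(r, x_t) - \hat{\lambda}_t(r) \right) \Big| \mathcal{B}_t \right] dr\, dt.
\end{aligned}
$$

Multiplying both sides of this equation by $\exp\{ -\psi_t(j\omega) \}$ and substituting
$\beta_t(r, j\omega)$ from (4.7) into the result, noting that
$\beta_t(r, 0) = \hat{\lambda}_t(r)$, we obtain

$$
\begin{aligned}
d\psi_t(j\omega) ={}& j\omega^T \left( A_t \frac{\partial \psi_t(j\omega)}{\partial (j\omega)} + B_t u_t \right) dt \\
&- \frac{1}{2} \omega^T D_t D_t^T \omega\, dt
  - \int_A \left( \beta_t(r, j\omega) - \beta_t(r, 0) \right) dr\, dt.
\end{aligned}
\tag{4.21}
$$

The discontinuity at $t = t_k$ is treated as follows. Let $r_k$ be the spatial
component of the event occurring at $t_k$. Then, from (4.20) we get

$$
\phi_{t_k^+}(j\omega) - \phi_{t_k^-}(j\omega)
= E\left[ \exp\left( j\omega^T x_{t_k^-} \right)
  \left( \lambda_{t_k^-}(r_k, x_{t_k^-})\, \hat{\lambda}_{t_k^-}^{-1}(r_k) - 1 \right)
  \Big| \mathcal{B}_{t_k^-} \right]
$$

which can be simplified as

$$
\phi_{t_k^+}(j\omega)
= E\left[ \exp\left( j\omega^T x_{t_k^-} \right) \lambda_{t_k^-}(r_k, x_{t_k^-})
  \Big| \mathcal{B}_{t_k^-} \right] \hat{\lambda}_{t_k^-}^{-1}(r_k).
$$

Multiplying both sides of this equality by $\exp\{ -\psi_{t_k^-}(j\omega) \}$ and taking
the logarithm, we obtain

$$
\begin{aligned}
\psi_{t_k^+}(j\omega) - \psi_{t_k^-}(j\omega)
&= \ln \left( \exp\{ -\psi_{t_k^-}(j\omega) \}
  E\left[ \exp\left( j\omega^T x_{t_k^-} \right) \lambda_{t_k^-}(r_k, x_{t_k^-})
  \Big| \mathcal{B}_{t_k^-} \right] \right) - \ln \hat{\lambda}_{t_k^-}(r_k) \\
&= \ln \beta_{t_k^-}(r_k, j\omega) - \ln \beta_{t_k^-}(r_k, 0).
\end{aligned}
$$

Combining this result with (4.21) and replacing $j\omega$ by $s$, we obtain (4.6).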
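From the definition of $\beta_t(r, j\omega)$ in (4.7), we have

$$
\beta_t(r, j\omega) = \exp\{ -\psi_t(j\omega) \} \int_{\mathbb{R}^n}
  p_{x_t}(x \mid \mathcal{B}_t) \exp\left( j\omega^T x \right) \lambda_t(r, x)\, dx.
\tag{4.22}
$$

Note that $p_{x_t}(x \mid \mathcal{B}_t)$ is the inverse Fourier transform of
$\exp\{ \psi_t(j\omega) \}$, and so we can write

$$
p_{x_t}(x \mid \mathcal{B}_t) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n}
  \exp\{ \psi_t(j\nu) \} \exp\left( -j\nu^T x \right) d\nu.
$$

Upon substituting this expression into (4.22) and interchanging the order of
integration¹, we obtain

$$
\beta_t(r, j\omega) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n}
  \exp\{ \psi_t(j\nu) - \psi_t(j\omega) \}
  \int_{\mathbb{R}^n} \lambda_t(r, x) \exp\left\{ -j(\nu - \omega)^T x \right\} dx\, d\nu.
$$

Replacing the second integral above by $\Lambda_t(r, j\nu - j\omega)$ and changing the
variable of integration $\nu$ with $\nu + \omega$, we get

$$
\beta_t(r, j\omega) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \Lambda_t(r, j\nu)
  \exp\{ \psi_t(j\nu + j\omega) - \psi_t(j\omega) \}\, d\nu.
$$

Finally, we obtain (4.8) upon replacing $j\omega$ with $s$.

¹ This interchange is permissible since for any fixed $t$, the integrand is continuous
in $x$ and $\nu$.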

4.5.2  Derivation  of  (4.9) 

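We first state a technical lemma from [43] which will be used later in deriving (4.9).
For the sake of completeness, we repeat below the proof from [43].

Lemma 4.5.1. Let $z_k, \bar{z}_k \in \mathbb{R}^k$, $z_l \in \mathbb{R}^l$, and
$\Theta_k$ and $\Theta_l$ be respectively $k \times k$ and $l \times l$ positive definite
matrices. Assume that $G$ is any $l \times k$ matrix. Then, we have

$$
\int_{\mathbb{R}^k} \Phi_k(z_k; \bar{z}_k, \Theta_k)\, \Phi_l(z_l; G z_k, \Theta_l)\, dz_k
= \Phi_l\left( z_l; G \bar{z}_k, \Theta_l + G \Theta_k G^T \right).
\tag{4.23}
$$

Proof. Denoting the Fourier transform of the left side of (4.23) by $F_l(\omega_l)$, we
can write

$$
\begin{aligned}
F_l(\omega_l) &= \int_{\mathbb{R}^k} \Phi_k(z_k; \bar{z}_k, \Theta_k)
  \exp\left( j\omega_l^T G z_k - \tfrac{1}{2} \omega_l^T \Theta_l \omega_l \right) dz_k \\
&= \exp\left( j\omega_l^T G \bar{z}_k - \tfrac{1}{2} \omega_l^T G \Theta_k G^T \omega_l \right)
  \exp\left( -\tfrac{1}{2} \omega_l^T \Theta_l \omega_l \right) \\
&= \exp\left( j\omega_l^T G \bar{z}_k
  - \tfrac{1}{2} \omega_l^T \left( \Theta_l + G \Theta_k G^T \right) \omega_l \right).
\end{aligned}
$$

Taking the inverse Fourier transform of the expression above, we get the right side
of (4.23). □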
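Lemma 4.5.1 is easy to sanity-check numerically in the scalar case $k = l = 1$, where it says that integrating a Gaussian prior against a Gaussian likelihood yields a Gaussian with the variances added (after scaling by $G$). The sketch below compares both sides of (4.23) with a Riemann sum; the function names and the specific numbers are illustrative assumptions.

```python
import numpy as np

def normal_pdf(z, mean, var):
    """Scalar Gaussian density Phi_1(z; mean, var)."""
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def lemma_lhs(z_l, z_bar, theta_k, G, theta_l, half_width=12.0, n=48001):
    """Left side of (4.23) for k = l = 1, by a Riemann sum over z_k."""
    spread = half_width * np.sqrt(theta_k)
    z = np.linspace(z_bar - spread, z_bar + spread, n)
    dz = z[1] - z[0]
    vals = normal_pdf(z, z_bar, theta_k) * normal_pdf(z_l, G * z, theta_l)
    # The integrand is negligible at the end points, so a plain sum suffices.
    return np.sum(vals) * dz

def lemma_rhs(z_l, z_bar, theta_k, G, theta_l):
    """Right side of (4.23) for k = l = 1."""
    return normal_pdf(z_l, G * z_bar, theta_l + G * theta_k * G)
```

The same identity is what turns $E[\lambda_t(r, x_t) \mid \mathcal{B}_t]$ under a Gaussian posterior into a single Gaussian in $r$ with covariance $C_t \Sigma C_t^T + R_t$.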


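The probability density function associated with the truncated expansion
$\tilde{\psi}_t(s) = s^T \tilde{x}_t + \frac{1}{2} s^T \tilde{\Sigma}_t s$ is Gaussian with
mean $\tilde{x}_t$ and covariance matrix $\tilde{\Sigma}_t$. With this approximate
probability density function and with $\lambda_t(r, x_t) = \mu_t \Phi_m(r; C_t x_t, R_t)$,
the approximation of $\beta_t(r, s)$ is given by

$$
\tilde{\beta}_t(r, s) = \exp\{ -\tilde{\psi}_t(s) \} \int_{\mathbb{R}^n}
  \Phi_n(x; \tilde{x}_t, \tilde{\Sigma}_t) \exp\left( s^T x \right)
  \mu_t \Phi_m(r; C_t x, R_t)\, dx.
$$

A simple calculation yields that

$$
\exp\{ -\tilde{\psi}_t(s) \}\, \Phi_n(x; \tilde{x}_t, \tilde{\Sigma}_t) \exp\left( s^T x \right)
= \Phi_n\left( x; \tilde{x}_t + \tilde{\Sigma}_t s, \tilde{\Sigma}_t \right).
$$

Then, using Lemma 4.5.1, we get

$$
\tilde{\beta}_t(r, s) = \mu_t \Phi_m\left( r; C_t \tilde{x}_t + C_t \tilde{\Sigma}_t s,\,
  C_t \tilde{\Sigma}_t C_t^T + R_t \right)
$$

which leads to

$$
\begin{aligned}
\ln \tilde{\beta}_t(r, s) - \ln \tilde{\beta}_t(r, 0) ={}&
  s^T \tilde{\Sigma}_t C_t^T \left( C_t \tilde{\Sigma}_t C_t^T + R_t \right)^{-1} (r - C_t \tilde{x}_t) \\
&- \frac{1}{2} s^T \tilde{\Sigma}_t C_t^T \left( C_t \tilde{\Sigma}_t C_t^T + R_t \right)^{-1}
  C_t \tilde{\Sigma}_t s
\end{aligned}
\tag{4.24}
$$

and

$$
\begin{aligned}
\tilde{\beta}_t(r, s) - \tilde{\beta}_t(r, 0) ={}&
  \mu_t \Phi_m\left( r; C_t \tilde{x}_t, C_t \tilde{\Sigma}_t C_t^T + R_t \right) \\
&\cdot \bigg\{ s^T \tilde{\Sigma}_t C_t^T \left( C_t \tilde{\Sigma}_t C_t^T + R_t \right)^{-1} (r - C_t \tilde{x}_t) \\
&\quad + \frac{1}{2} \left( s^T \tilde{\Sigma}_t C_t^T \left( C_t \tilde{\Sigma}_t C_t^T + R_t \right)^{-1}
  (r - C_t \tilde{x}_t) \right)^2 \\
&\quad - \frac{1}{2} s^T \tilde{\Sigma}_t C_t^T \left( C_t \tilde{\Sigma}_t C_t^T + R_t \right)^{-1}
  C_t \tilde{\Sigma}_t s + O\left( \|s\|^3 \right) \bigg\}.
\end{aligned}
\tag{4.25}
$$

We combine (4.24), (4.25), and (4.6) and match the coefficients of $s^T(\cdot)$ and
$s^T(\cdot)s$ from both sides to obtain (4.9).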

4.5.3  Proof  of  Theorem  4.4.1 

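Our proof consists of the following four steps.

Step I: Using the properties of conditional expectation, it is easy to show that

$$
E\left[ x_t^T Q_t x_t \right] = E\left[ \hat{x}_t^T Q_t \hat{x}_t + \mathrm{tr}\{ Q_t \Sigma_t \} \right].
$$

Then the cost functional (4.4) can be expressed as

$$
J = E\left[ \int_0^T \left( \tilde{x}_t^T Q_t \tilde{x}_t + \mathrm{tr}\{ Q_t \tilde{\Sigma}_t \}
  + u_t^T P_t u_t + \delta_t^1 \right) dt + x_T^T S x_T \right]
\tag{4.26}
$$

where $\delta_t^1$ is defined as

$$
\delta_t^1 = \mathrm{tr}\left\{ Q_t \left( \hat{x}_t \hat{x}_t^T - \tilde{x}_t \tilde{x}_t^T
  + \Sigma_t - \tilde{\Sigma}_t \right) \right\}.
$$

Step II: For $t \geq 0$ and for any positive $\epsilon$, $\Delta N_t = N_{t+\epsilon} - N_t$
is a conditionally Poisson random variable with the stochastic rate

$$
\Lambda_t^\epsilon = \int_t^{t+\epsilon} \int_A \lambda_\tau(r, x_\tau)\, dr\, d\tau.
$$

Thus, using the law of total probability, we can write

$$
\begin{aligned}
\Pr\{ \Delta N_t = 1 \mid \mathcal{B}_t \}
&= E\left[ \Pr\{ \Delta N_t = 1 \mid \Lambda_t^\epsilon \} \mid \mathcal{B}_t \right] \\
&= E\left[ \Lambda_t^\epsilon \exp\left( -\Lambda_t^\epsilon \right) \mid \mathcal{B}_t \right] \\
&= \epsilon q_t + O(\epsilon^2)
\end{aligned}
\tag{4.27}
$$

where $q_t$ is defined as

$$
q_t = \int_A E\left[ \lambda_t(r, x_t) \mid \mathcal{B}_t \right] dr.
\tag{4.28}
$$

In a similar manner, we can show that

$$
\begin{aligned}
\Pr\{ \Delta N_t = 0 \mid \mathcal{B}_t \} &= 1 - \epsilon q_t + O(\epsilon^2) \\
\Pr\{ \Delta N_t \geq 2 \mid \mathcal{B}_t \} &= O(\epsilon^2).
\end{aligned}
\tag{4.29}
$$

Let the random vector $R \in \mathbb{R}^m$ denote the location of a single event
occurring during $[t, t + \epsilon)$. We show that

$$
p_R(r \mid \Delta N_t = 1, \mathcal{B}_t)
= \frac{E\left[ \lambda_t(r, x_t) \mid \mathcal{B}_t \right]}{q_t}\, I_A(r) + O(\epsilon)
\tag{4.30}
$$

where $I_A(r) = 1$ if $r \in A$ and $I_A(r) = 0$ otherwise. For this purpose, let
$\mathcal{D}(r) \subset A$ denote an $m$-dimensional cube with side length $\Delta r$
which is centered at $r \in A$. Defining $\mathcal{T} = [t, t + \epsilon)$ and using
Bayes' rule, we can write

$$
\begin{aligned}
p_R(r \mid \Delta N_t = 1, \mathcal{B}_t)
&= \lim_{\Delta r \to 0} \Delta r^{-m} \Pr\{ R \in \mathcal{D}(r) \mid \Delta N_t = 1, \mathcal{B}_t \} \\
&= \lim_{\Delta r \to 0} \Delta r^{-m}
  \Pr\{ N(\mathcal{T} \times \mathcal{D}(r)) = 1 \mid \Delta N_t = 1, \mathcal{B}_t \} \\
&= \lim_{\Delta r \to 0} \frac{1}{\Delta r^m}
  \frac{\Pr\{ N(\mathcal{T} \times \mathcal{D}(r)) = 1,\ N(\mathcal{T} \times A) = 1 \mid \mathcal{B}_t \}}
       {\Pr\{ \Delta N_t = 1 \mid \mathcal{B}_t \}}.
\end{aligned}
\tag{4.31}
$$

Note that the event of $N(\mathcal{T} \times \mathcal{D}(r)) = 1$ and
$N(\mathcal{T} \times A) = 1$ is equivalent to the event of
$N(\mathcal{T} \times \mathcal{D}(r)) = 1$ and
$N(\mathcal{T} \times (A - \mathcal{D}(r))) = 0$. Therefore, defining
$\mathcal{X}_t = \{ x_\tau \mid \tau \in \mathcal{T} \}$ and using the law of total
probability and properties of a space-time point process, we get

$$
\begin{aligned}
&\Pr\{ N(\mathcal{T} \times \mathcal{D}(r)) = 1,\ N(\mathcal{T} \times A) = 1 \mid \mathcal{B}_t \} \\
&\quad = E\left[ \Pr\{ N(\mathcal{T} \times \mathcal{D}(r)) = 1,\
  N(\mathcal{T} \times (A - \mathcal{D}(r))) = 0 \mid \mathcal{X}_t, \mathcal{B}_t \}
  \mid \mathcal{B}_t \right] \\
&\quad = E\left[ \Pr\{ N(\mathcal{T} \times \mathcal{D}(r)) = 1 \mid \mathcal{X}_t \}
  \Pr\{ N(\mathcal{T} \times (A - \mathcal{D}(r))) = 0 \mid \mathcal{X}_t \}
  \mid \mathcal{B}_t \right] \\
&\quad = E\left[ \int_{\mathcal{T} \times \mathcal{D}(r)} \lambda_\tau(s, x_\tau)\, d\tau\, ds\,
  \left( 1 - O(\epsilon) \right) \Big| \mathcal{B}_t \right] \\
&\quad = \epsilon \Delta r^m E\left[ \lambda_t(r, x_t) \mid \mathcal{B}_t \right]
  + O\left( \epsilon \Delta r^{m+1} \right) + O\left( \epsilon^2 \Delta r^m \right).
\end{aligned}
\tag{4.32}
$$

Substituting (4.27) and (4.32) into (4.31), we obtain (4.30).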

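Assume that $\tilde{p}_{x_t}(x \mid \mathcal{B}_t)$ is the Gaussian approximation of
$p_{x_t}(x \mid \mathcal{B}_t)$. Then, using Lemma 4.5.1, we can write

$$
\begin{aligned}
E\left[ \lambda_t(r, x_t) \mid \mathcal{B}_t \right]
={}& \int_{\mathbb{R}^n} \tilde{p}_{x_t}(x \mid \mathcal{B}_t)\, \lambda_t(r, x)\, dx
  + \int_{\mathbb{R}^n} \left( p_{x_t}(x \mid \mathcal{B}_t) - \tilde{p}_{x_t}(x \mid \mathcal{B}_t) \right)
    \lambda_t(r, x)\, dx \\
={}& \mu_t \Phi_m\left( r; C_t \tilde{x}_t, C_t \tilde{\Sigma}_t C_t^T + R_t \right)
  + \int_{\mathbb{R}^n} \left( p_{x_t}(x \mid \mathcal{B}_t) - \tilde{p}_{x_t}(x \mid \mathcal{B}_t) \right)
    \lambda_t(r, x)\, dx.
\end{aligned}
\tag{4.33}
$$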

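Step III: Let $g_t(x, \Sigma)$ be a scalar function of $x \in \mathbb{R}^n$ and the
$n \times n$ symmetric matrix $\Sigma$. Assume that the partial derivatives of
$g_t(x, \Sigma)$ with respect to $t$, $x$, and $\Sigma$ exist. Using the law of total
probability we can write

$$E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t\big]
= \sum_{k=0}^{\infty} E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t, \Delta N_t = k\big]
  \Pr\{\Delta N_t = k \mid \mathcal{B}_t\}.$$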


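Substituting $\Pr\{\Delta N_t = k \mid \mathcal{B}_t\}$ from (4.27) and (4.29) into
this expression, and using the law of total probability again, we find

$$\begin{aligned}
E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t\big]
&= E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t, \Delta N_t = 0\big] (1 - \epsilon q_t) \\
&\quad + E\Big[ E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t, \Delta N_t = 1, R = r\big]
   \,\Big|\, \mathcal{B}_t, \Delta N_t = 1 \Big] \epsilon q_t + O(\epsilon^2).
\end{aligned}$$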


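Substituting (4.28) and (4.30) into this equality and rearranging terms, we obtain

$$\begin{aligned}
E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t\big]
&= E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t, \Delta N_t = 0\big] \\
&\quad + \epsilon \int_A \Big( E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t, \Delta N_t = 1, R = r\big] \\
&\qquad\qquad - E\big[g_{t+\epsilon}(\hat{x}_{t+\epsilon}, \Sigma_{t+\epsilon}) \mid \mathcal{B}_t, \Delta N_t = 0\big] \Big)
   E[\lambda_t(r, x_t) \mid \mathcal{B}_t]\, dr + O(\epsilon^2).
\end{aligned} \tag{4.34}$$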

Conditioned on $\mathcal{B}_t$ and $\Delta N_t = 0$, (4.9) can be solved during $[t, t+\epsilon)$ to obtain


xt+e  =  xt  +  eAtxt  +  eBtut  -  eidtht(xt,  tt)  +  O  (e2) 

Ei+e  =  Ef  +  eAfAt  +  eE tAj  +  eDtD J  +  e/q/A (ay,  Ej)  +  O  (e2)  . 


(4.35) 


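Also, conditioned on $\mathcal{B}_t$, $\Delta N_t = 1$, and $R = r$, we can write

$$\begin{aligned}
\hat{x}_{t+\epsilon} &= \hat{x}_t + M_t (r - C_t \hat{x}_t) + O(\epsilon) \\
\Sigma_{t+\epsilon} &= \Sigma_t - M_t C_t \Sigma_t + O(\epsilon).
\end{aligned} \tag{4.36}$$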


Substituting  (4.35)  and  (4.36)  into  (4.34),  and  linearizing  with  respect  to  e,  we  get 


Et+e)|  38t  =  gt(xt,  Et) 


e^~  gt (xt,  Ei)  +  e  (  gt (xt,  E t)  ^  (Xxt  +  Btut  -  gtht  (xt,  E t 


+  etr 


{ (Jr  + ^ + DtD* +  ^)^T) } 


+  e  J  \gt  [xt  +  Mt  (r  -  Ctxt) ,  Et  -  MtCtT,tJ  ~  gt(xt ,  Et)J  E  [A*  (r,  xt)  \38t\  dr 

+  o  (e2)  . 

We substitute $E[\lambda_t(r, x_t) \mid \mathcal{B}_t]$ from (4.33) into the expression
above and use the linear operator $\mathcal{L}_t\{\cdot\}$ defined by (4.10) to obtain
the simplified form


E 


9t(xt+ei  ^-h+e)  | 


~  c)  ~ 

=  9t  (xt,  St)  +  e—  (jt  (xt,  St) 

gt(xt,tt)^j  ( Atxt  +  Btut ) 

Et)j  (Attt  +  ttAj  +  DtDj 
+  \  9t{xt ,  E:i)  !•  +  eA2  +  O  (e2)  (4.37) 


A 

<954 


+£tr{di9‘(i,': 


where the error term $\delta_t^2$ is defined as


A 


9t(xt  +  Mt  (r  -  Ctxt)  ,  Et  -  MtCtT,tj  -  gt(xt,  Et) 

xl^t)  - pXt  (x\&t))\t  (r,  x)  drdx. 


Define the nonlinear operator $\mathcal{K}_t\{\cdot\}$ as

o  /  o  \  T1 

/C4  {&  (x,  YAj}  =  —gt  (x,  E)  +  (  —  gt  (x,  E)  J  7Ltx 

+  tr  {  <7t  (a,  E))  (^tE  +  EA^  +  DtDj)  +  QtE| 

+  xTQtx  +  pt£t  {gt  (x,  E)}  . 

Then, (4.37) can be written as


1  f  d  \  2 

E  gt+e{xt+ei^t+e)\3$t  —  9t  {%ti  ^t)  +  e  ut  +  ^  1  Bj  yQ~T  gti^ti^t)  J  +  e^t 

-  e  (xjQtxt  +  tr  {qa}  +  uj Ptu^j 
+  e/Q  1 9t(%t,  Et)  |  +  O  (e2)  .  (4.38) 

Let $g_t(\cdot, \cdot)$ be the solution of the partial differential equation (4.11)
with the boundary condition $g_T(x, \Sigma) = x^{\mathsf T} S x$. This implies that
$\mathcal{K}_t\{g_t(\hat{x}_t, \Sigma_t)\} = 0$. Under this condition, we take the
expectation of (4.38) to get

E  gt+e(xt+e,^t+e)  —  E  gt(xt,tt)  +  eE  ut  +  ^P~1B'[  gt (xt,  tt)^j 

+  eE  [<52]  —  eE  xjQtXt  +  tr  |  +  uj PfUf  +  O  (e2)  . 

(4.39) 

Step IV: We partition the interval $[0, T)$ into $K$ subintervals $[t_k, t_{k+1})$,
$k = 0, 1, \ldots, K-1$, where $t_0 = 0$, $t_K = T$, and $t_{k+1} - t_k = \epsilon_k > 0$.
Recalling that $\hat{x}_T^{\mathsf T} S \hat{x}_T = g_{t_K}(\hat{x}_{t_K}, \Sigma_{t_K})$,
we approximate the cost functional (4.26) by the finite sum

K- 1 

J  -  Jk  =  Y1  efcE  +  tr  [Qtk^tk  }  +  uJkPtkutk  +  Slk  +  E  gtK  (xtK,ttK) 

k= 0 

and rearrange it as


—Z 

JK  =  Y  ekE  xJkQtkxtk  +  tr  \^QtkEtk }  +  uJkPtkutk  +  8]k  +  E  8]K1 


+  e^-iE  xf^Qt^Xt^  +  tr  +  uf^Pt^Ut^ 

+  E  gtK(xtK,  tttK)  . 


In this expression, we replace $E[g_{t_K}(\hat{x}_{t_K}, \Sigma_{t_K})]$ by the right
side of (4.39). With minor manipulations, and upon defining
$\delta_t = \delta_t^1 + \delta_t^2$ according to (4.13), we find


—Z 

■h<  =  Y  ekE  +  tr  {QtkZtk }  +  i'lPtkutk  +  8]k  +  E  gtK_x  (xtK_1,'EtK_1) 


+  eK-iE  UtK-i  +  - 


+  £k- iE  StK_ i  +  O  (e^-_ i)  . 


'  9tK- 1  (j^tK—l  >  ^Pf-l 


Repeating this procedure for $k = K-2, K-3, \ldots, 1, 0$, we obtain


li  —1 

JK  =  E  gt0  (xt0,  tt0)  +  Y,  €k  E  [  Kl 


YekE  Ut*  +  o  PtklBtk 


2  k  B tk  f  o  ~  9tk  {ptk  1  ^tfe ) 

\  tk 


2  1  if-1 

+  0  (e^) 

fc=0 


Finally, we take the limit of $J_K$ as $K \to \infty$ and $\max_k \epsilon_k \to 0$ to
obtain (4.12).

4.5.4 Proof of Theorem 4.4.2

For $A = \mathbb{R}^m$ and $g_t(x, \Sigma)$ given by (4.15), we can show that


Ct  {gt  (x,  E)}  —  ft(E  —  Tt  (E)  CtE )  -  ft  (E)  +  tr  {Tt  (E)  CtEKt} 


which clearly does not depend on $x$. Therefore, (4.11) can be decomposed into
two decoupled equations: the partial differential equation (4.17) with the boundary
condition $f_T(\Sigma) = 0$ and the equation


d  ( xTKtx ) 

Ft 


( xTKtx) 


dx 


T 


Atx 


1  f  d  (xTKtx) 
4  1  dx 


+  xTQtx 


with the boundary condition $x^{\mathsf T} K_T x = x^{\mathsf T} S x$. This equation
holds for arbitrary $x$ if and only if $K_t$ satisfies (4.16) with the terminal
condition $K_T = S$.

Chapter 5


Cooperative  Optical  Beam  Tracking:  Concept  and  Model 

5.1  Introduction 

In free-space optical communication using narrow laser beams, the optical
alignment between the stations must be maintained in spite of their relative motion.
This relative motion is caused by the mobile nature of the stations, mechanical
vibration, or accidental shocks. Establishing and maintaining a free-space optical
link requires a two-phase optical alignment mechanism. In the first phase, a coarse
alignment is achieved through the open-loop operation of spatial acquisition
[41, 44]. Following the coarse alignment phase, data transmission is established
and, simultaneously, a closed-loop fine alignment operation is performed to precisely
compensate for the persistent relative motion of the stations. A possible scheme to
achieve this fine alignment is cooperative (reciprocal) optical beam tracking.

A cooperative optical beam tracking system consists of two stations, each of
which points its optical beam toward the other one. The receiving station
continuously measures the arrival direction of its incident optical beam and uses
it as a guide to precisely point its own beam toward the other station. In short
range applications with negligible light propagation delay, this direction is
approximately along the line-of-sight of the stations, so the stations transmit
their optical beams along this measured direction. In applications with a large
propagation delay, the optical beams must be transmitted at a certain angle with
respect to the arrival direction in order to compensate for the variation of the
line-of-sight during the travel time of the transmitted beams. This requires the
transmitter to predict the future location of the receiver and point its optical
beam toward the predicted location.
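As a rough numerical illustration of the angle mentioned above, the sketch below uses the standard first-order point-ahead approximation, angle ≈ 2 v / c, where v is the relative velocity component perpendicular to the line-of-sight. This formula is a textbook approximation assumed here for illustration; it is not derived in this chapter, and the function name and the sample velocity are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, vacuum

def point_ahead_angle(transverse_velocity: float) -> float:
    """First-order point-ahead angle (rad) for a relative velocity component
    perpendicular to the line-of-sight (m/s). The factor 2 accounts for the
    light travel time of both the received and the transmitted beam."""
    return 2.0 * transverse_velocity / SPEED_OF_LIGHT

if __name__ == "__main__":
    # Hypothetical intersatellite case: 7 km/s transverse relative velocity.
    angle = point_ahead_angle(7000.0)
    print(f"point-ahead angle: {angle * 1e6:.1f} microradians")
```

For the hypothetical 7 km/s case the angle is a few tens of microradians, which is larger than the narrowest angular spreads used in long range links; this is why the prediction step cannot be neglected there.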

To  implement  the  alignment  scheme  above,  the  stations  are  equipped  with  a 
position-sensitive  photodetector  (e.g.,  quadrant  detector)  and  a  focusing  lens  (or  an 
arrangement  of  curved  mirrors)  to  measure  the  azimuth  and  elevation  components  of 
the  beam  arrival  direction.  In  addition,  each  station  employs  an  electromechanical 
pointing  assembly  to  adjust  the  direction  of  its  optical  devices  according  to  the 
control  signals  provided  by  a  closed-loop  controller.  The  controller  incorporates 
the  output  of  the  position-sensitive  photodetector  and  generates  proper  azimuth 
and  elevation  control  signals.  As  an  alternative  (or  complement)  to  adjusting  the 
transceiver  direction,  the  incoming  and  outgoing  optical  fields  can  be  directed  using 
an  arrangement  of  steerable  flat  mirrors. 

The goal of this chapter is to develop a mathematical model for a cooperative
optical beam tracking system, which includes the nonlinear effects, major
disturbance sources, and light propagation delay. For analyzing the optical
alignment between two fast maneuvering stations (e.g., aircraft), the nonlinearity
of the dynamical equations is essential; however, in applications such as
intersatellite communication, in which the relative motion consists of a
predetermined large component and an unknown small component, we can linearize
the nonlinear dynamics around a nominal state trajectory.

In  the  last  section,  we  shall  describe  the  relative  motion  of  the  stations  by 
means  of  a  set  of  stochastic  differential  equations.  This  stochastic  model  will  be 
used  later  in  Chapter  6  for  a  stochastic  analysis  of  the  system,  as  an  alternative  to 
the deterministic approach of [21, 22].

5.2  System  Architecture 

In this section, we first consider the structure and components of an optical
transceiver and then describe the operation of a cooperative optical beam tracking
system which employs two transceivers of this type.

5.2.1  Transceiver  Structure 

A  schematic  diagram  of  a  simple  transceiver  used  in  short  range  free-space  optical 
links  is  illustrated  in  Figure  5.1  (see  also  [1]).  This  transceiver  comprises  a  lens,  a 
position-sensitive  photodetector,  and  a  narrow  laser  source,  all  installed  on  a  rigid 
platform.  The  photodetector  surface  is  perpendicular  to  the  lens  axis  and  its  center 
is placed at the focus of the lens. The axes of the lens and the laser source are
parallel to the transceiver axis. The azimuth and elevation of the transceiver axis
can be controlled by means of an electromechanical pointing assembly, which is
mounted on the station body.

The optical beam generated by the laser source is used for two purposes: as a
carrier of information and as a beacon assisting the opposite station in its
tracking and pointing operations. For the purpose of communication, the
instantaneous laser power is modulated by the information-bearing signal, usually
with a digital form of on-off keying.

Figure 5.1: Schematic diagram of an optical transceiver for short range applications.

The position-sensitive photodetector is a photoelectron converter whose surface
is partitioned into small regions. The output of each region counts the number
of converted electrons regardless of their location within the region. The
photoelectron conversion rate depends on the instantaneous optical power absorbed
by the region. The image of the received optical field on the surface of the
photodetector is a spot of light with a bell-shaped intensity pattern whose
location depends on the angle of arrival of the optical field with respect to the
transceiver axis. Hence, using the position-sensitive photodetector, this angle
can be tracked by measuring the location of the spot of light. Many practical
optical beam tracking systems employ a quadrant detector¹ as their optical sensing
device, although the low spatial resolution of a quadrant detector can be improved
using a finer partition. For instance, the authors of [45] describe a beam tracking
system which employs a photodetector with 512 x 512 pixels.
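To make the measurement principle concrete, the following sketch estimates the spot location from the four region counts of a quadrant detector using the common normalized-difference rule. This estimator is a widely used textbook rule assumed here for illustration; it is not the estimator developed in this dissertation, and the quadrant ordering and function name are hypothetical conventions.

```python
def quadrant_estimate(q1: float, q2: float, q3: float, q4: float):
    """Normalized-difference estimate of the spot offset from quadrant counts.

    Assumed quadrant layout (hypothetical convention):
      q1: x > 0, y > 0    q2: x < 0, y > 0
      q3: x < 0, y < 0    q4: x > 0, y < 0
    Returns (x, y), each in [-1, 1], or None if no photoelectrons were counted.
    """
    total = q1 + q2 + q3 + q4
    if total == 0:
        return None
    x = ((q1 + q4) - (q2 + q3)) / total
    y = ((q1 + q2) - (q3 + q4)) / total
    return x, y

if __name__ == "__main__":
    print(quadrant_estimate(25, 25, 25, 25))  # centered spot
    print(quadrant_estimate(40, 10, 10, 40))  # spot shifted toward +x
```

The returned values are dimensionless and only indicate the direction and relative size of the offset; mapping them to an angle of arrival requires the spot shape and the focal length, which this sketch does not model.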

The pointing assembly is usually a two-axis gimballed system with two
independent motors which control the azimuth and elevation of the transceiver.
Gimballed pointing systems generally suffer from low bandwidth (on the order of
10 Hz) and low slew rate, while being able to cover a large solid angle. Also, they
have the disadvantage of being singular at certain points, which limits their
coverage region [46]. To resolve this difficulty, Omni-Wrist III is an alternative
antenna pointer with double universal joints and linear actuators, which has a
$2\pi$ steradian range of motion without singularity [46].

A  more  sophisticated  transceiver  design,  used  for  intersatellite  communication, 
is  illustrated  in  Figure  5.2  (for  detailed  discussion  see  [47,  48]).  Similar  to  Figure  5.1, 
this  design  employs  a  position-sensitive  photodetector,  a  pointing  assembly,  and  a 
laser  source;  however,  instead  of  a  lens,  it  employs  a  reflecting  telescope. 

The telescope, which is shared between the receiving and the transmitting
optics, consists of a primary and a secondary curved mirror arranged in one of
several common designs. The most popular design [47], the Cassegrainian telescope,
employs a parabolic primary mirror and a hyperbolic secondary mirror which share
the same focus. In addition to the telescope, an arrangement of lenses (not shown
in Figure 5.2) might be used for extra magnification [47]. In the design of the
transceiver, the incoming and outgoing optical fields must be isolated as much as
possible, since the backscattered photons caused by the outgoing light act as a
source of noise for the photodetector. This can be achieved by a combination of
spectral isolation, spatial separation, and polarization isolation [47]. When these
techniques cannot provide enough isolation, two separate telescopes are required
for the incoming and the outgoing optical beams [47]; this dual telescope approach
leaves the tracking function of the transceiver unchanged.

¹ A quadrant detector is a position-sensitive photodetector with a four-region partition.

Figure 5.2: Optical transceiver for intersatellite communication (based on [48]).

The tracking mirror in Figure 5.2 is intended to control the direction of the
incoming light toward the position-sensitive photodetector and the outgoing light
toward the target. This steerable flat mirror, which is equipped with miniature
actuators, provides a complementary (or alternative) means for the pointing
assembly. The steering machinery consists of a support plate with a single pivot
and three or four piezoelectric linear actuators (fast steering mirror). Although
the scanning region of a steerable mirror is small (less than 5 degrees in each
direction), its small mass and fast actuators result in a high bandwidth (up to
1 kHz) and high slew rate. This provides considerable assistance to the pointing
assembly in suppressing high bandwidth disturbances.

The  point-ahead  mirror  is  another  steerable  flat  mirror  with  the  purpose  of 
compensating  for  the  displacement  of  the  receiver  during  light  propagation  time. 
This  mirror  provides  an  additional  degree  of  freedom  in  controlling  the  pointing 
direction  of  the  outgoing  light. 

5.2.2  The  Concept  of  Cooperative  Optical  Beam  Tracking 

We consider a two-way optical link consisting of two transceivers of the type
discussed earlier, in which each station transmits its optical beam toward the
other station and receives the optical beam from the other side. We assume that
the stations are subject to relative motion.

For a simple description of the alignment scheme, consider the transceiver of
Figure 5.1 and suppose that a uniform optical field strikes the transceiver aperture
(i.e., the lens). When the striking optical field propagates along the axis of the
lens, its image is a spot of light at the center of the position-sensitive
photodetector, while any deviation from this direction shifts the spot of light
from the center. This shift can be detected by the position-sensitive
photodetector. The output of the photodetector is fed to a closed-loop controller,
which adjusts the transceiver direction by applying proper signals to the pointing
assembly. For short range applications (negligible propagation delay), the goal of
the controller is to eliminate the angle between the lens axis and the arrival
direction of the incident optical beam. Since the axes of the lens and the laser
source are parallel, this operation aligns the propagation direction of the
transmitted optical beam with the arrival direction of the received optical beam.
Assuming that both stations actively perform this operation, the propagation
directions of the optical beams stay close to the line-of-sight, in spite of the
relative motion between the stations. The block diagram in Figure 5.3 illustrates
the interconnection between the components of a cooperative optical beam tracking
system.

[Block diagram; recoverable labels: Rotation, Translational Motion, Rotation]

Figure 5.3: Interconnection between the components of a cooperative optical beam
tracking system.

In  the  optical  transceiver  of  Figure  5.2,  the  task  of  adjusting  the  light  direction 
is  distributed  between  the  pointing  assembly  and  the  tracking  mirror.  In  applications 
such  as  intersatellite  communication,  the  relative  motion  consists  of  a  large,  slowly 
varying  component  and  a  small,  high  bandwidth  term.  Accordingly,  the  control 
law  consists  of  an  open-loop,  coarse  control  and  a  closed-loop,  fine  control.  In  this 
case,  a  reasonable  design  employs  the  pointing  assembly  for  the  purpose  of  coarse 
(open-loop)  control  and  the  tracking  mirror  for  closed-loop  fine  control. 

In the transceiver of Figure 5.1, a successful tracking operation, which keeps
the spot of light close to the center of the photodetector, requires the transmitted
optical beam to propagate along the arrival direction of the received optical beam;
however, in applications with a large propagation delay, the optical beam must be
pointed ahead with respect to this arrival direction. The required point-ahead angle
can be accommodated by means of the point-ahead mirror.

5.2.3 Assisting Equipment

The application of inertial sensors (gyros and accelerometers) to the alignment
aspect of intersatellite optical communication is considered in [49]. By measuring
the angular velocity and acceleration, these sensors provide information regarding
the position of the stations. This information can be combined with the output of
the photodetectors to improve the overall performance of the system.

Another possibility for improving the performance of the system is to exchange
information between the stations. Sharing the output of the photodetectors and the
inertial sensors enables each individual station to produce a more accurate estimate
of the line-of-sight, which in turn increases the capability of the stations to
compensate for their relative motion. The means for exchange of information can be
provided by the optical channel itself or an independent low bandwidth RF channel.

In the optical transceiver of Figure 5.1, an additional pointing error can be
introduced by a misalignment between the axes of the lens and the laser source.
This misalignment might occur due to an imperfect manufacturing process. The
same type of pointing error arises in the optical transceiver of Figure 5.2, because
of the imperfect positioning of the laser source and the point-ahead mirror. Note
that this type of pointing error can be measured only by the receiving station;
thus, in order to eliminate it, the stations need to share their outputs. This is
another indication that information exchange between the stations improves the
performance of the system.

5.3  The  Model 

In this section, we develop a mathematical model for a cooperative optical beam
tracking system. In order to avoid unnecessary complications in notation, we shall
assume that the optical link under consideration consists of two identical
transceivers (Figure 5.1 or Figure 5.2). We begin by fixing notation and defining
the necessary coordinate systems. Then we determine the optical field on the
aperture of each transceiver, which will be used later to derive a formula for the
optical intensity on the photodetector surface. This will be followed by a
discussion of the effect of atmospheric turbulence on the optical intensity. Next,
we shall present a statistical description of the photodetector output in terms of
the optical intensity. Finally, we introduce the dynamical equations which describe
the temporal evolution of the system.

5.3.1  Notation  and  Coordinate  Systems 

In what follows, we distinguish the stations by superscripts $a$ and $b$, or
$i = a, b$ when referring to both stations. Let $X$ be an inertial coordinate system.
Consider the position vector of station $i$ with respect to the origin of $X$ and
denote by $r_t^i$ its representation in coordinate system $X$ at time $t$. The light
travel time between the stations (with abuse of notation) will be denoted by $t_d$
and can be determined from


td  =  c 


-i 


where $c$ is the light velocity in the propagation medium (vacuum). Note that in
general, $t_d$ depends on time, which is not emphasized by this notation.
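The time dependence of the light travel time can be illustrated with a small fixed-point computation. The sketch below assumes, for illustration only, that the travel time satisfies the implicit relation t_d = ‖r_b(t + t_d) − r_a(t)‖ / c, i.e., the receiver position is evaluated at the arrival time; the trajectory function, the names, and the sample numbers are all hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, vacuum

def light_travel_time(r_a, r_b_of, t, iterations=20):
    """Solve t_d = |r_b(t + t_d) - r_a| / c by fixed-point iteration.

    r_a     : transmitter position at time t (3-tuple, metres)
    r_b_of  : function returning the receiver position at a given time
    The iteration contracts quickly because station speeds are far below c.
    """
    t_d = 0.0
    for _ in range(iterations):
        t_d = math.dist(r_a, r_b_of(t + t_d)) / SPEED_OF_LIGHT
    return t_d

if __name__ == "__main__":
    # Hypothetical receiver receding along x at 7 km/s from 1000 km away.
    receiver = lambda time: (1.0e6 + 7000.0 * time, 0.0, 0.0)
    print(light_travel_time((0.0, 0.0, 0.0), receiver, 0.0))
```

For a stationary receiver the loop reduces to distance divided by c after the first pass; for a moving receiver it converges to the delay at which the beam actually meets the receiver.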

For station $i$ and at time $t$, we define the coordinate system $\mathcal{R}_t^i$,
which is attached to the center of the transceiver aperture in such a manner that its
$z$ axis extends outward perpendicular to the aperture plane, its $x$ axis is parallel
to the elevation axis (see Figure 5.1 and Figure 5.2), and its $y$ axis is the cross
product of the $z$ and $x$ axes. We denote by $\Omega_t^i$ the rotation matrix from
coordinate system $X$ to coordinate system $\mathcal{R}_t^i$. Let $\mathcal{B}_t^i$ be
a coordinate system fixed to the body of station $i$ and denote by $\omega_t^i$ the
angular velocity of $X$ with respect to $\mathcal{B}_t^i$ represented in $X$. The
angular velocity of $\mathcal{B}_t^i$ with respect to $\mathcal{R}_t^i$ represented in
$\mathcal{R}_t^i$ will be denoted by $\nu_t^i$.

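We shall assume that the optical field transmitted by station $i$ propagates
along the $z$ axis of the coordinate system $\mathcal{T}_t^i$. This coordinate system
is obtained by two successive rotations of $\mathcal{R}_t^i$ in the following manner.
First, rotate $\mathcal{R}_t^i$ around its $x$ axis by an angle $-\delta_t^{x,i}$ to
get an intermediate coordinate system, and then rotate the latter around its $y$ axis
by an angle $\delta_t^{y,i}$ to obtain $\mathcal{T}_t^i$. The rotation matrix from
$\mathcal{R}_t^i$ to $\mathcal{T}_t^i$ is given by

$$\begin{bmatrix}
\cos\delta_t^{y,i} & \sin\delta_t^{x,i}\sin\delta_t^{y,i} & \cos\delta_t^{x,i}\sin\delta_t^{y,i} \\
0 & \cos\delta_t^{x,i} & -\sin\delta_t^{x,i} \\
-\sin\delta_t^{y,i} & \sin\delta_t^{x,i}\cos\delta_t^{y,i} & \cos\delta_t^{x,i}\cos\delta_t^{y,i}
\end{bmatrix}.$$

For small $\delta_t^{x,i}$ and $\delta_t^{y,i}$, this matrix can be approximated by

$$\begin{bmatrix}
1 & 0 & \delta_t^{y,i} \\
0 & 1 & -\delta_t^{x,i} \\
-\delta_t^{y,i} & \delta_t^{x,i} & 1
\end{bmatrix}. \tag{5.1}$$

Let $a = [a_1\ a_2\ a_3]^{\mathsf T}$ and define the operator $[\cdot]_\times$ such that

$$[a]_\times = \begin{bmatrix}
0 & -a_3 & a_2 \\
a_3 & 0 & -a_1 \\
-a_2 & a_1 & 0
\end{bmatrix}.$$

Then, (5.1) can be expressed in the compact form¹

$$\Delta_t^i = I + \big[I^{*\mathsf T} \delta_t^i\big]_\times \tag{5.2}$$

where $I$ is the 3x3 identity matrix, $\delta_t^i = [\delta_t^{x,i}\ \delta_t^{y,i}]^{\mathsf T}$,
and the 2x3 matrix $I^*$ is defined by

$$I^* = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.$$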
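The small-angle form I + [a]× can be compared numerically with an exact product of two elementary rotations. The sketch below is an illustration under an assumed convention: the exact matrix is built as R_y(δy) R_x(δx) with the usual right-handed elementary rotation matrices, which may differ in sign conventions from the frame rotations described in the text. All function names are hypothetical.

```python
import math

def skew(a):
    """Cross-product matrix [a]_x of a 3-vector a = (a1, a2, a3)."""
    a1, a2, a3 = a
    return [[0.0, -a3, a2],
            [a3, 0.0, -a1],
            [-a2, a1, 0.0]]

def matmul(m, n):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def exact_rotation(dx, dy):
    """Assumed composition: rotation about x followed by rotation about y."""
    return matmul(rot_y(dy), rot_x(dx))

def linearized_rotation(dx, dy):
    """I + [I*^T delta]_x with delta = (dx, dy), i.e. a = (dx, dy, 0)."""
    s = skew((dx, dy, 0.0))
    return [[(1.0 if i == j else 0.0) + s[i][j] for j in range(3)]
            for i in range(3)]

def max_abs_error(dx, dy):
    e, l = exact_rotation(dx, dy), linearized_rotation(dx, dy)
    return max(abs(e[i][j] - l[i][j]) for i in range(3) for j in range(3))

if __name__ == "__main__":
    for scale in (1.0, 0.5, 0.25):
        print(scale, max_abs_error(1e-3 * scale, 2e-3 * scale))
```

Halving both angles reduces the largest entry-wise error by roughly a factor of four, which is consistent with the linearization discarding only second-order terms in the angles.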


In the optical transceiver of Figure 5.1, for the ideal case that the axes of
the lens and the laser source are perfectly aligned, we have $\delta_t^i = 0$;
however, as mentioned in Section 5.2.3, due to an imperfect manufacturing process,
there might be a small angle between the lens axis and the laser axis. This can be
modeled by letting $\delta_t^i = \epsilon_t^i$ in (5.2), where the elements of
$\epsilon_t^i$ are the misalignment errors in the $x$ and $y$ directions (elevation
and azimuth).

¹ Afterward, we consider (5.1) as an exact formula.

For the optical transceiver of Figure 5.2, $\delta_t^i$ depends on the orientations
of the tracking and point-ahead mirrors, which are described by the 2-dimensional
vectors $\alpha_t^i$ and $\beta_t^i$, respectively. These vectors are defined as
follows. Let $Q$ be a vector normal to the tracking mirror $i$ and define $\bar{Q}$
such that $Q = \bar{Q}$ when the actuators of the mirror are in the "rest" condition.
Then, the elements of $\alpha_t^i$ are the deviations (angles) of $Q$ from $\bar{Q}$
in the $x$ and $y$ directions. Since the elements of $\alpha_t^i$ and $\beta_t^i$ are
small, their contributions to $\delta_t^i$ appear linearly, i.e., we can write

$$\delta_t^i = K \alpha_t^i + L \beta_t^i + \epsilon_t^i$$

where $K$ and $L$ are known 2x2 matrices. Substituting this result into (5.2), we
express the rotation matrix from $\mathcal{R}_t^i$ to $\mathcal{T}_t^i$ as

$$\Delta_t^i = I + \big[I^{*\mathsf T} (K \alpha_t^i + L \beta_t^i + \epsilon_t^i)\big]_\times. \tag{5.3}$$

Note that this expression can be used for the transceiver of Figure 5.1 as well, by
setting $\alpha_t^i = \beta_t^i = 0$.

Regarding the optical transceiver of Figure 5.1, we define the 2-dimensional
coordinate system $\mathcal{D}_t^i$ in the focal plane of the lens (also the
photodetector plane) such that its center is located at the focus of the lens and
its $x$ and $y$ axes are parallel to the $x$ and $y$ axes of $\mathcal{R}_t^i$,
respectively. For the optical transceiver of Figure 5.2, the coordinate system
$\mathcal{D}_t^i$ is defined in the plane of the photodetector surface such that its
origin is located at the projection of the focus of the secondary mirror on the
photodetector surface, its $x$ axis is parallel to the $x$ axis of $\mathcal{R}_t^i$,
and its $y$ axis is perpendicular to its $x$ axis.

5.3.2  Optical  Field  on  the  Transceiver  Aperture 

Consider the coordinate system $O$ with axes $(a_x, a_y, a_z)$ and assume that a
laser beam with a wavelength $\lambda$ and a divergence angle $\psi$ propagates along
$a_z$. Since the major fraction of the laser power is concentrated in its fundamental
transverse mode (TEM$_{00}$), we shall use a Gaussian beam model for the optical field
generated by the laser. Based on this model, the complex amplitude of the optical
field at a point $r = (x, y, z)$, $z > 0$ is given [23] by

$$U(r) = \sqrt{\frac{2P}{\pi}}\, \frac{1}{w(z)}
\exp\left\{ -jkz - \left( \frac{1}{w^2(z)} + \frac{jk}{2R(z)} \right) (x^2 + y^2) \right\} \tag{5.4}$$

where $P > 0$ is the laser power, $k = 2\pi/\lambda$ is the wave number, and $w(z)$
and $R(z)$ are defined as

$$w(z) = w_0 \left(1 + (z/z_0)^2\right)^{1/2}$$
$$R(z) = z \left(1 + (z_0/z)^2\right)$$

with¹ $w_0 = \lambda/(\pi\psi)$ and $z_0 = \lambda/(\pi\psi^2)$.
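The beam parameters above are easy to evaluate numerically. The following sketch computes the waist radius, the characteristic distance z0, the beam radius w(z), and the curvature radius R(z) from the wavelength and the divergence angle; the function names and the sample link values are hypothetical and chosen only for illustration.

```python
import math

def beam_parameters(wavelength: float, psi: float):
    """Waist radius w0 and characteristic distance z0 (metres) of a Gaussian
    beam with the given wavelength (m) and divergence angle psi (rad)."""
    w0 = wavelength / (math.pi * psi)
    z0 = wavelength / (math.pi * psi ** 2)
    return w0, z0

def beam_radius(z: float, w0: float, z0: float) -> float:
    """Beam radius w(z) at distance z from the waist."""
    return w0 * math.sqrt(1.0 + (z / z0) ** 2)

def curvature_radius(z: float, z0: float) -> float:
    """Wavefront radius of curvature R(z) for z > 0."""
    return z * (1.0 + (z0 / z) ** 2)

if __name__ == "__main__":
    # Hypothetical medium range link: 1 micron wavelength, psi = 50 microradians.
    w0, z0 = beam_parameters(1e-6, 50e-6)
    print(f"w0 = {w0 * 1e3:.2f} mm, z0 = {z0:.1f} m")
    z = 50e3  # 50 km
    print(f"w(z) = {beam_radius(z, w0, z0):.3f} m, psi*z = {50e-6 * z:.3f} m")
    print(f"R(z) = {curvature_radius(z, z0):.1f} m")
```

Far beyond z0 the beam radius grows almost linearly as ψ z and R(z) approaches z, which is the regime in which the wavefront over a small receiving aperture is nearly that of a spherical wave centered at the transmitter.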

¹ In spite of our convention, here the subscript 0 does not denote time.

Suppose that $r$ can be decomposed as $r = \bar{r} + \delta r$ such that
$\bar{r} = (\bar{x}, \bar{y}, \bar{z})$ and $\delta r = (\delta x, \delta y, \delta z)$
satisfy the conditions $\|\delta r\|/\|\bar{r}\| \ll 1$ and
$|\delta z| \ll z_0 \ll \|\bar{r}\|$. Denote by $(\bar{r}, a_z)$ the angle between the
vectors $\bar{r}$ and $a_z$ and assume that the magnitude of $(\bar{r}, a_z)$ is on the
order of $\psi$ or smaller, i.e., $(\bar{r}, a_z) \lesssim \psi$. Then, for a small
$\psi$ (e.g., smaller than 1 mrad), the optical field (5.4) can be approximated by


where $\cdot$ denotes the dot product operator and $\phi$ is defined as

$$\phi = k\bar{z} + k\, \frac{\bar{x}^2 + \bar{y}^2}{2R(\bar{z})}.$$


In the context of free-space optics, $\bar{r}$ represents the line-of-sight of the
receiver with respect to the transmitter and $\delta r$ is the position vector of a
point on the aperture plane of the receiver with respect to the center of the
aperture. With this assignment, $|\delta z|/\|\delta r\|$ will be on the order of the
pointing error $(\bar{r}, a_z)$. Table 5.1 presents some typical values of the
parameters of a free-space optical link which are relevant to approximation (5.5).
Note that in Table 5.1, it is assumed that under the closed-loop regime, the pointing
error is on the order of 1/10 of the angular spread of the beam. According to this
table, the conditions for approximating (5.4) with (5.5) are satisfied for all
scenarios, i.e., short, medium, and long range applications.

Parameter          Symbol             Short Range        Medium Range   Long Range
                                      (terrestrial)      (airborne)     (intersatellite)
Wavelength         $\lambda$          1 µm               1 µm           1 µm
Angular Spread     $2\psi$            500 - 3000 µrad    100 µrad       1 - 50 µrad
Range              $\|\bar{r}\|$      0.1 - 5 km         10 - 100 km    1000 - 80000 km
Aperture Diameter  $2\|\delta r\|$    5 - 20 cm          20 cm          20 - 100 cm
Pointing Error     $(\bar{r}, a_z)$   50 - 300 µrad      10 µrad        0.1 - 5 µrad
                   $z_0$              0.15 - 5 m         125 m          500 m - 1250 km
                   $|\delta z|$       order of µm        order of µm    order of µm

Table 5.1: Typical values of the parameters of a free-space optical link. The data is
gathered from [47, 3, 50].

Note that $\phi$ on the right side of (5.5) introduces a constant phase over the
aperture of the receiver, which can be dropped without affecting our discussion. In
addition, Table 5.1 indicates that for long range applications we have


Sx2  +  Sy2 
2 1 1  r  1 1 


<  1 


(5.6) 


which  allows  us  to  simplify  (5.5)  as 


U  (r  +  Sr)  ~ 


(r,az)2\ 

r  ) 


exp 


(5.7) 


Although (5.7) is not a good approximation for short and medium range
applications, it can still be used for these cases, since the phase (5.6) can be
compensated by proper adjustment of the distance between the lens and the
photodetector surface [51].

We  use  approximation  (5.7)  to  determine  the  optical  held  on  the  aperture  of 
the  stations.  For  this  purpose,  consider  the  position  vector  of  a  point  on  the  aperture 
plane  of  transceiver  i  with  respect  to  the  center  of  the  aperture  and  let  /JV,  s*  G  M2 
be  its  representation  in  the  coordinate  system  1Z\.  Then,  the  representation  of  this 
vector in X is given by (J*f2j) s\ We define the 2-dimensional (tracking error)
vectors $\theta_t^a$ and $\theta_t^b$ as


Ot 

0bt 


W  {rt~  rt ) 

(rt  -  rt) 


(5.8) 


In  order  to  find  the  optical  field  on  the  aperture  of  station  b,  we  replace  r  with  rb  —  r“ 
and  5r  with  (/*f^)Ts6  in  (5.7).  This  is  equivalent  to  replacing  f  •  <5r/||f ||  with  —9b-sb. 
To  obtain  a  proper  replacement  for  (f,  a2),  we  must  take  into  account  the  travel  time 
of  light  between  the  stations.  For  this  purpose,  we  define  the  2-dimensional  (pointing 
error)  vectors  and  ijjb  as 


rt  = 


rt  = 


VW  (r‘+td  -  r“+(J 


r 


£+£<2  ^~H<2 

,a 

t+td 


(rt \td  ~  rbt+td ) 


r 


£+^c2 


r 


£+£<2  I 


(5.9) 


Then,  it  is  easy  to  show  that  ( f,az )2  must  be  replaced  by  In  a  similar 

manner,  we  can  find  proper  replacements  for  determining  the  optical  field  on  the 
aperture  of  station  a. 

Let $U_t^i(s)$ denote the complex amplitude of the optical field at a point s on
the aperture of station i. Then, using the replacements mentioned above, we can
write


exp  (  -  (||V’t-J|  /^)2  )  exp (jkO?  ■  s ) 


exp  (-  /^)2)  exp (jk6b  •  s) 


(5.10)

where $P_t^i \ge 0$ is the instantaneous power transmitted by station i. It can be observed
from (5.10) that the optical intensity at station b depends on the pointing error
of station a, while the phase depends only on the tracking error $\theta_t^b$ of station b.
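
To give a feeling for the size of this dependence, the following minimal Python sketch evaluates the intensity factor exp(-2 (||pointing error|| / ψ)^2), which is the square of the amplitude factor in (5.10). It uses the closed-loop assumption of Table 5.1 that the pointing error is about 1/10 of the full angular spread 2ψ. The function name and the numerical value of ψ are illustrative assumptions, not part of the model.

```python
import math

def pointing_loss(pointing_error, half_spread):
    """Intensity factor exp(-2 (|psi_err| / psi)^2) caused by a pointing error."""
    return math.exp(-2.0 * (pointing_error / half_spread) ** 2)

half_spread = 50e-6                  # psi: half of a 100 microrad angular spread (illustrative)
error = 0.1 * (2.0 * half_spread)    # pointing error of 1/10 of the full spread 2*psi
loss = pointing_loss(error, half_spread)
print(f"fraction of the on-axis intensity that is received: {loss:.3f}")
```

Under this assumption the factor depends only on the ratio of the pointing error to ψ, so the same fraction applies to the short, medium, and long range columns of Table 5.1.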

Note that in (5.10), we allow $P_t^i$ to be time-dependent in order to describe
the information-bearing signal which modulates the optical power of station i. The
nature of the information-bearing signals suggests that a nonnegative stochastic
process is an appropriate means for modeling $P_t^i$. The statistical properties of this
stochastic process depend on the type of modulation scheme and channel coding;
however, for many applications, a detailed characterization is not necessary, and
only a few parameters (e.g., expected value and coherence time) and the knowledge
of general properties (e.g., stationarity) of the process are enough to use the model.


5.3.3  Optical  Intensity  on  the  Photodetector  Surface 

The optical field on the focal plane of a thin lens can be determined from Fraunhofer
diffraction [41, 51]. Consider the optical transceiver of Figure 5.1 and assume
that the optical field $U_t^i(s)$ on the receiving lens is given by (5.10). According to
Fraunhofer diffraction, the optical field on the x-y plane of V\ is given [41, 51] by


u: 


d.i 


[S  = 


:j\f, 


exp 


Ul(s)ex  p 


k 

-JTs-s 
J  c 


ds 


(5.11)

where $f_c$ and $\varrho$ are the focal length and the radius of the (circular) lens, respectively.
In terms of (5.11), the optical intensity on the photodetector surface is given by


(5.12) 


Let $J_1(\cdot)$ be the Bessel function of the first kind of order one and define the
intensity pattern $\gamma(\cdot)$ as

$$\gamma(s) = \frac{J_1^2\left(k\varrho\|s\|/f_c\right)}{\pi\|s\|^2} \qquad (5.13)$$


Then, using (5.10), (5.11), and (5.12) and defining $y_t^i = f_c\theta_t^i$, we determine the
optical intensities $I_t^a(s)$ and $I_t^b(s)$ as


It  00 


2  g2Pl 


t-td 


'• 0 2 


2  p2P,a_ 


t-td 


ip2 


exp 


exp 


(— 2  (ll^t-tdll  /^)2)  ~  Vt) 

(— 2  (ll^-td  II  /V')2)  t(s  —  Ut)  ■ 


(5.14) 


For the optical transceiver of Figure 5.2, the entire telescope is considered as
a "big" lens with a focal length $f_c$. Also, the effect of the tracking mirror on the
optical intensity $I_t^i(s)$ will be included by modifying $y_t^i$ as

$$y_t^i = f_c\theta_t^i + H\alpha_t^i \qquad (5.15)$$

where H is a known 2×2 matrix. The linearity of (5.15) is justified by the assumption
that $\|\alpha_t^i\|$ is small.

In order to simplify our analysis in Chapter 6, it is desirable to approximate
the intensity pattern $\gamma(s)$ with a Gaussian function, i.e.,

$$\gamma(s) \approx \Phi_2(s; 0, R) \qquad (5.16)$$

where $R = 2(f_c/k\varrho)^2 I_{2\times 2}$. The comparison between (5.13) and (5.16), illustrated in
Figure 5.4, indicates that (5.16) is a reasonably close approximation to (5.13). Note
that both $\gamma(s)$ and its Gaussian approximation approach the unit impulse $\delta(\|s\|)$
as $k\varrho/f_c \to \infty$.
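
The closeness of the two patterns can also be checked numerically. The sketch below is a minimal illustration in plain Python: it evaluates the Bessel-function pattern of (5.13) and an isotropic Gaussian density with covariance R as in (5.16), both written in terms of the single ratio a = (k times lens radius) / (focal length). The value of a, the helper names, and the use of Bessel's integral to compute J_1 are assumptions made for the example only.

```python
import math

def bessel_j1(x, n=2000):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x sin tau) dtau (midpoint rule)."""
    h = math.pi / n
    total = sum(math.cos((m + 0.5) * h - x * math.sin((m + 0.5) * h)) for m in range(n))
    return total * h / math.pi

def airy_pattern(r, a):
    """Pattern of (5.13) as a function of r = ||s||, with a = k * radius / focal length."""
    if r == 0.0:
        return a * a / (4.0 * math.pi)   # limiting value, since J_1(x) ~ x/2 near 0
    return bessel_j1(a * r) ** 2 / (math.pi * r * r)

def gaussian_pattern(r, a):
    """Isotropic 2-D Gaussian density with covariance (2 / a^2) * I, as in (5.16)."""
    var = 2.0 / (a * a)
    return math.exp(-r * r / (2.0 * var)) / (2.0 * math.pi * var)

a = 2.0   # hypothetical value of the ratio, in inverse length units
for r in (0.0, 0.5, 1.0, 1.5):
    print(f"r={r:.1f}  bessel={airy_pattern(r, a):.5f}  gaussian={gaussian_pattern(r, a):.5f}")
```

With this choice of covariance the two patterns share the same peak value at the origin, which is one way to see why R scales as the square of the focal length over k times the lens radius.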


Figure  5.4:  Comparison  of  7  (s)  with  its  Gaussian  approximation. 


5.3.4  Atmospheric  Turbulence 

The optical intensity (5.14) was determined based on the assumption that the laser
beam propagates through free-space (vacuum). While this assumption is justified
for intersatellite applications, for communication through the atmosphere, (5.14) must
be modified to include the effect of atmospheric turbulence.

Atmospheric turbulence, which is caused by differential heating of the air, results
in random variations in the refractive index of air. This, in turn, causes
random fluctuations in the intensity and phase of the received optical field. The
refractive index of air as a function of the position vector r and time t can be modeled
as $n_t(r) = n + \delta n_t(r)$, where n is a constant and $\{\delta n_t(r)\}$ is a stochastic
field. The statistical properties of $\{\delta n_t(r)\}$ can be derived from the Kolmogorov
theory [52, 30, 26].

The Rytov method¹ is frequently used to analyze and model the propagation
of  an  optical  field  in  the  turbulent  atmosphere  [30,  26].  In  this  method,  the  complex 
amplitude  of  the  optical  field  is  expressed  as 

Ut(r)  =  Tt(r)Ut(r)  (5.17) 

where, on the right side, $U_t(r)$ denotes the optical field under the condition $\delta n_t(r) = 0$
(free space) and $\{T_t(r)\}$ is a stochastic field which can be determined in terms of
$\{\delta n_t(r)\}$ and this free-space field.

¹This method provides an approximate solution to the Maxwell wave equation.

²Roughly speaking, the turbulence coherence length (Fried parameter) is the maximum distance
between two points $r_1$ and $r_2$ for which $T_t(r_1)$ and $T_t(r_2)$ are highly correlated. For a detailed
characterization of this parameter see [52, 30].

In general, using (5.17) to obtain an expression for the optical intensity leads to a
complicated calculation which is beyond the scope of this study. For short range
applications (on the order of 1 km for weak to moderate turbulence) in which the diameter
of the receiving aperture is smaller than the turbulence coherence length², the
stochastic field $\{T_t(r)\}$ is approximately uniform over the aperture. Therefore, $\{T_t(r)\}$
can be approximated by a stochastic process. By means of this approximation, we
modify (5.14) as


It  (*) 


2 

||  rt  ~  rt  | 

2  Q2^Pta-td 

"02  r\  —  r“ 


exp  (-2  (|p?_  J /<T)  7 (s~yi) 
exp  (-2  (||^f_J|  /</>f)  7(s  -  vi) 


(5.18) 


where $\{\kappa_t^a\}$ and $\{\kappa_t^b\}$ are nonnegative stochastic processes.

Under the condition of this approximation, $\kappa_t^i$ is a unit-mean lognormal stochastic
process [30, 26]. The variance of $\kappa_t^i$ (for a fixed t) depends on the wavelength of
the light, the propagation distance, the refractive-index structure constant, and the
shape of the optical field [30]. Although a complete description of the temporal
evolution of $\{\kappa_t^i\}$ does not exist, its autocorrelation function can be approximated
using Taylor's frozen-flow hypothesis [52].
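
For a fixed t, a unit-mean lognormal gain can be generated as exp(X) with X Gaussian of mean -σ²/2 and variance σ², since E[exp(X)] = exp(mean + variance/2) = 1. The Python sketch below illustrates only this marginal property; the value of σ, the seed, and the function name are hypothetical, and no temporal correlation (frozen-flow or otherwise) is modeled.

```python
import math
import random

def sample_fading(sigma, rng):
    """Draw one unit-mean lognormal gain: exp(X) with X ~ N(-sigma^2 / 2, sigma^2)."""
    return math.exp(rng.gauss(-0.5 * sigma * sigma, sigma))

rng = random.Random(0)          # fixed seed so the run is repeatable
sigma = 0.3                     # hypothetical spread of the log-gain
samples = [sample_fading(sigma, rng) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)
print(f"sample mean of the fading gain: {sample_mean:.4f}")
```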


5.3.5  The  Photodetector  Output 

In order to model the position-sensitive photodetectors, we first introduce the space-time
rate $\lambda_t^i(r)$, which is an affine function of the optical intensity $I_t^i(r)$, i.e.,


K  (r)  =  nil  (r)  +  A. 


(5.19)

Here, the known constant $\eta > 0$ is the photodetector sensitivity¹ and A ≥ 0 is
a known constant which characterizes the combination of the dark current noise
and the background radiation [29]. Substituting $I_t^i(r)$ from (5.18) into (5.19), we
express $\lambda_t^a(r)$ and $\lambda_t^b(r)$ as


K  (r)  =  ”t-td  exP  (-2  (|  W-t  J  /^Y)  l(r  ~  Vt )  +  A 

A*  ( r )  =  v\_td  exp  (-2  (||$Ltd  ||  /V>)2)  l(r  -  Vt)  +  A 


(5.20) 


where $\nu_t^a$ and $\nu_t^b$ are defined as


zy  = 


2  T)Q  k[ 


.2  pfe 

f+td^t 


^2 


<Y>  er  _  p 

| '  t+td  '  t+td  | 


z/.  = 


2  rjg  k' 


2,„a  pa 
t+tdrt 


if)"2 


<Y> u  _  p  cr 

I  ^+^(2  I 


Note that $\nu_t^i/\eta$ is the instantaneous optical power received by station i in the absence
of pointing error.

We denote the set of points on the surface of the photodetector by $\mathcal{A}$. In a
position-sensitive photodetector, $\mathcal{A}$ is partitioned into q subsets $\mathcal{A}_k$, $k = 1, 2, \dots, q$,
such that $\bigcup_{k=1}^{q} \mathcal{A}_k = \mathcal{A}$. The output of photodetector i is a q-dimensional vector $Y_t^i$
such that its kth element $Y_t^{k,i}$ is the output of the region $\mathcal{A}_k$. We model $Y_t^{k,i}$ as a
doubly stochastic Poisson process with the rate process $\{\Lambda_t^{k,i}\}$, where $\Lambda_t^{k,i}$ is given by

$$\Lambda_t^{k,i} = \int_{\mathcal{A}_k} \lambda_t^i(r)\,dr. \qquad (5.21)$$

Moreover, for $l \neq k$, conditioned on $\{\Lambda_t^{k,i}\}$ and $\{\Lambda_t^{l,i}\}$, the stochastic processes
$\{Y_t^{k,i}\}$ and $\{Y_t^{l,i}\}$ are mutually independent.

¹The photodetector sensitivity is given by $\eta = \zeta/hf$, where h is Planck's constant, f is the
mean frequency of light, and $0 < \zeta \le 1$ is the quantum efficiency of the photodetector.
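
As an illustration of (5.21), the sketch below computes the region rates for a hypothetical four-quadrant detector. It assumes the Gaussian spot of (5.16) in place of the exact pattern, a square detector of half-width h centred at the origin, a total spot rate `nu` (with any pointing-loss factor already absorbed), and a uniform background rate per unit area. All names and numbers are illustrative; for a Gaussian spot the integral over each quadrant factors into two one-dimensional normal probabilities.

```python
import math

def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def quadrant_rates(nu, center, std, background, half_width):
    """Rates of the four quadrants of the square detector [-h, h]^2.

    Returned order: (+x,+y), (-x,+y), (-x,-y), (+x,-y).
    """
    h = half_width
    cx, cy = center
    # Probability mass of the spot on the positive / negative half of each axis.
    px_pos = normal_cdf(h, cx, std) - normal_cdf(0.0, cx, std)
    px_neg = normal_cdf(0.0, cx, std) - normal_cdf(-h, cx, std)
    py_pos = normal_cdf(h, cy, std) - normal_cdf(0.0, cy, std)
    py_neg = normal_cdf(0.0, cy, std) - normal_cdf(-h, cy, std)
    area = h * h   # area of one quadrant
    return [nu * px * py + background * area
            for px, py in ((px_pos, py_pos), (px_neg, py_pos),
                           (px_neg, py_neg), (px_pos, py_neg))]

centered = quadrant_rates(nu=1e6, center=(0.0, 0.0), std=1.0, background=10.0, half_width=8.0)
shifted = quadrant_rates(nu=1e6, center=(0.5, 0.0), std=1.0, background=10.0, half_width=8.0)
print("centered spot:", [round(x) for x in centered])
print("shifted spot: ", [round(x) for x in shifted])
```

A centred spot gives four equal rates, while a displacement along x raises the rates of the two +x quadrants, which is the imbalance a position-sensitive detector exploits.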

Remark 5.3.1. Let $|\mathcal{A}_k|$ denote the area of region k. Then, under the condition

$$q \to \infty \quad \text{and} \quad \max_{k \in \{1,2,\dots,q\}} |\mathcal{A}_k| \to 0,$$

the vector-valued stochastic process $Y_t^i$ tends to a space-time point process with
rate (5.20). This motivates approximating the output of a high spatial
resolution photodetector by a space-time point process.

Remark 5.3.2. A first order approximation for a Poisson process is its expected
value. For the q-dimensional vector $Y_t^i$, this approximation can be expressed as
$Y_t^i \approx \Lambda_t^i$, where $\Lambda_t^i$ is a q-dimensional vector with elements $\Lambda_t^{k,i}$. This approximation
can be improved as

$$Y_t^i \approx \Lambda_t^i + n_t^i \qquad (5.22)$$

where the noise vector $n_t^i$ is independent of $\Lambda_t^i$. This simple approximation is useful
when a deterministic approach is adopted to study the system [41, 21].


5.3.6  Dynamical  Equations 


Let $u_t^{p,i}$, $u_t^{\alpha,i}$, and $u_t^{\beta,i}$ denote the input (control) vectors of the pointing assembly, the
tracking mirror, and the point-ahead mirror, respectively. Define the disturbance
vector $p_t^i$ as

$$p_t^a = -p_t^b = \frac{r_t^b - r_t^a}{\|r_t^b - r_t^a\|}.$$

The goal of this section is to develop a state-space model which determines the
output vector $(y_t^i, \psi_t^i)$ in terms of the control vector $u_t^i = (u_t^{p,i}, u_t^{\alpha,i}, u_t^{\beta,i})$ and the
disturbance vector $(\omega_t^i, p_t^i, \epsilon_t^i)$. We note that (5.8) and (5.15) express $y_t^i$ in terms
of $p_t^i$, $\Omega_t^i$, and $\alpha_t^i$. Also, $\psi_t^i$ can be determined in terms of $p_t^i$, $\epsilon_t^i$, $\Omega_t^i$, $\alpha_t^i$, and $\beta_t^i$
using (5.3) and (5.9). Thus, we need to obtain dynamical equations which describe
the temporal evolution of $\Omega_t^i$, $\alpha_t^i$, and $\beta_t^i$.

Referring to the definition of $v_t^i$ and $\omega_t^i$ in Section 5.3.1, we can show that $\Omega_t^i$
is the solution of the matrix differential equation

$$\dot\Omega_t^i = [v_t^i]_\times \Omega_t^i + \Omega_t^i [\omega_t^i]_\times.$$

Here, the angular velocity vector $v_t^i \in \mathbb{R}^3$ is controlled by the pointing assembly. In
the most general case, the relationship between $v_t^i$ and $u_t^{p,i}$ can be described by the
nonlinear state-space equations

$$\dot x_t^{p,i} = f(x_t^{p,i}, u_t^{p,i}), \qquad v_t^i = g(x_t^{p,i}) \qquad (5.23)$$

where $x_t^{p,i} \in \mathbb{R}^{n_p}$ is the state vector and f(·) and g(·) are smooth vector fields with
proper dimensions. The explicit forms of f(·) and g(·) depend on the structure
of the pointing assembly and will not be discussed here. For a two-axis gimbaled
pointing assembly, the control vector $u_t^{p,i}$ is 2-dimensional.

We model the dynamics of the steerable flat mirrors by the linear state-space
equations

$$\dot x_t^{\alpha,i} = A^\alpha x_t^{\alpha,i} + B^\alpha u_t^{\alpha,i}, \qquad \alpha_t^i = C^\alpha x_t^{\alpha,i}$$

and

$$\dot x_t^{\beta,i} = A^\beta x_t^{\beta,i} + B^\beta u_t^{\beta,i}, \qquad \beta_t^i = C^\beta x_t^{\beta,i} \qquad (5.24)$$

where the superscripts α and β refer to the tracking and point-ahead mirrors, respectively.
In these equations, $x_t^{\alpha,i} \in \mathbb{R}^{n_\alpha}$ and $x_t^{\beta,i} \in \mathbb{R}^{n_\beta}$ are the state vectors, $u_t^{\alpha,i} \in \mathbb{R}^2$
and $u_t^{\beta,i} \in \mathbb{R}^2$ are the control vectors, and the matrices have appropriate dimensions.
The linearity of the equations is justified by the fact that the flat mirrors operate
over small angles. It is worth remarking here that the actuators of the flat mirrors
are fast dynamical systems, so that for an approximate analysis, we can ignore their
dynamics and approximate $\alpha_t^i \approx u_t^{\alpha,i}$ and $\beta_t^i \approx u_t^{\beta,i}$.

In summary, we characterize each station as a dynamical system with the
state vector $(x_t^{p,i}, \Omega_t^i, x_t^{\alpha,i}, x_t^{\beta,i})$, the control vector $(u_t^{p,i}, u_t^{\alpha,i}, u_t^{\beta,i})$, and the disturbance
vector $(\omega_t^i, p_t^i, \epsilon_t^i)$, where the state vector evolves in time according to

ni=[g{xr)]x^  +  ^  Mx 

(5.25) 

xf  =  Aaxf  +  Bauf 
x\ 8j  =  A0x04  +  B0u0’\ 

Also,  the  output  vector  ( y\ ,ipl)  is  expressed  in  terms  of  the  state  and  disturbance 
vectors  as 


y\  =  fJM.pl  +  HCaxf 


rt  =  h  i+ 


If  ( KCaxaj  +  LC0x0j  +  e\ 


(5.26) 


^t.Pt+td- 


5.3.7  Model  Summary  and  Discussion 

The mathematical model developed in this section is summarized in the block diagram
of Figure 5.5. This block diagram describes a dynamical system with the input
vector $(u_t^a, u_t^b)$, the disturbance vector $(\omega_t^a, \omega_t^b, p_t^a, p_t^b, \epsilon_t^a, \epsilon_t^b, \nu_{t-t_d}^a, \nu_{t-t_d}^b)$, and the
output vector $(Y_t^a, Y_t^b)$. Here, the output vector $(Y_t^a, Y_t^b)$ "statistically" depends on
the state of the system and the disturbance vector. According to Figure 5.5, except
for $\omega_t^a$ and $\omega_t^b$, which appear in the state-space equations, the other elements of the
disturbance vector appear only in the output equations. Therefore, $p_t^i$, $\epsilon_t^i$, and $\nu_{t-t_d}^i$,
i = a, b, affect the state vector only after establishing closed-loop paths from $Y_t^i$
to $u_t^i$. Also, we observe from Figure 5.5 that when the feedback loops exist, the
subsystems a and b are coupled through (5.20) ("Optical Intensity Model" block in
Figure 5.5) and the linear constraint $p_t^a + p_t^b = 0$.

Figure 5.5: Block diagram of a cooperative optical beam tracking system. In this figure,
the blocks marked by "State-Space Equations", "Output Equations", and "Optical Intensity
Model" refer to (5.25), (5.26), and (5.20), respectively. Also, "Photodetector Model"
refers to (5.21) and the vector-valued doubly stochastic Poisson process $Y_t^i$ defined in
Section 5.3.5.

In a cooperative optical beam tracking system, the goal of the closed-loop control
is to maintain some appropriate norm of $\{y_t^i\}$ and $\{\psi_t^i\}$ close to zero. Here, the
condition $y_t^i \approx 0$ is the objective of the optical beam tracking operation, while $\psi_t^i \approx 0$
is associated with the active pointing. In short range applications with $t_d = 0$,
assuming that $\epsilon_t^i = 0$, the two conditions are equivalent, i.e., the tracking and pointing
operations are combined. In long range applications where $t_d \neq 0$, the conditions can
be achieved independently, by means of the additional degree of freedom provided
by the point-ahead mirrors.

The solution to the control problem above depends on the structure of the
information which is provided to the controllers. In one scenario, the controller of
each station has access only to the output of the same station, i.e., the control problem
is decentralized. In another possible scheme, the stations share their outputs
through the optical link or an independent RF channel. This scheme improves the
performance of the system by increasing the information for estimating the common
disturbance $p_t^a = -p_t^b$. On the other hand, sharing the outputs is the only possibility
for compensating $\epsilon_t^i$, since this disturbance vector can be observed only by the
receiving station. As auxiliary equipment, an arrangement of three (perpendicular)
rate gyros and accelerometers carried by each station provides relevant
information for estimating $\omega_t^i$ and $p_t^i$.

Up to this point, we have not offered any model for the disturbance vectors, although
such a model is essential for further analysis. A disturbance model might be
deterministic or stochastic, depending on the preferred analysis tools and methods.
For a deterministic analysis, the disturbance vectors are modeled using appropriate
deterministic functions which are rich enough to represent the family of all possible
instances. In this type of analysis, we can also approximate the output vector
by (5.22).

A stochastic disturbance model can be characterized by an appropriate stochastic
state-space equation driven by a vector-valued Wiener process. Then, a
complete stochastic model is constructed by combining this equation with (5.25).
This stochastic modeling approach will be considered in Section 5.5.

5.4 Linearizing the Dynamical Equations

In applications such as intersatellite communication, the relative motion (disturbance)
consists of a large, predetermined component and a small, unknown term.
Accordingly, the control law consists of an open-loop, coarse control and a small,
closed-loop, fine control. In this case, the nonlinear state-space equations (5.25)
and the output equations (5.26) can be linearized around a predetermined nominal
trajectory, which results in a linear time-varying model for the fine control regime.

In order to obtain the linearized model, every vector appearing in (5.25)
and (5.26) will be expressed as the sum of a nominal vector and a (small) deviation
vector. For a vector x, the nominal and deviation vectors will be denoted by $\bar x$
and $\delta x$, respectively. Thus, we express x as $x = \bar x + \delta x$. Because the tracking mirrors
are not involved in the coarse control regime, we set $\bar\alpha_t^i = 0$. In this procedure, $p_t^i$
needs special attention, since it must satisfy the condition $\|p_t^i\| = 1$. This norm
condition is obviously satisfied for the nominal vector $\bar p_t^i$ defined as

$$\bar p_t^a = -\bar p_t^b = \frac{\bar r_t^b - \bar r_t^a}{\|\bar r_t^b - \bar r_t^a\|}.$$


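We define the deviation vectors $\delta\zeta_t^a$ and $\delta\zeta_t^b$ as

$$\delta\zeta_t^a = -\delta\zeta_t^b = \frac{\delta r_t^b - \delta r_t^a}{\|\bar r_t^b - \bar r_t^a\|}.$$

Then, assuming that $\|\delta\zeta_t^i\|^2 \ll 1$, we can show that the condition $\|\bar p_t^i + \delta p_t^i\| = 1$ is
approximately preserved if the deviation vector $\delta p_t^i$ is given by

$$\delta p_t^i = \left(I - \bar p_t^i (\bar p_t^i)^T\right)\delta\zeta_t^i. \qquad (5.27)$$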
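
The role of the projection in (5.27) can be checked with a few lines of Python. The sketch below is illustrative only: the nominal unit vector and the deviation are hypothetical numbers. It shows that removing the component of the deviation along the nominal direction leaves a norm error that is second order in the deviation, whereas adding the raw deviation gives a first-order error.

```python
import math

def project_deviation(p_bar, d_zeta):
    """Return (I - p_bar p_bar^T) d_zeta for a unit vector p_bar."""
    dot = sum(p * d for p, d in zip(p_bar, d_zeta))
    return [d - dot * p for p, d in zip(p_bar, d_zeta)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

p_bar = [0.0, 0.6, 0.8]                # hypothetical nominal unit vector
d_zeta = [1e-3, 2e-3, 1.5e-3]          # hypothetical small deviation
d_p = project_deviation(p_bar, d_zeta)

norm_error_projected = abs(norm([p + d for p, d in zip(p_bar, d_p)]) - 1.0)
norm_error_raw = abs(norm([p + d for p, d in zip(p_bar, d_zeta)]) - 1.0)
print(f"norm error with projection:    {norm_error_projected:.2e}")
print(f"norm error without projection: {norm_error_raw:.2e}")
```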

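The rotation matrix $\Omega_t^i$ will be expressed as $\Omega_t^i \approx (I + [\delta\phi_t^i]_\times)\bar\Omega_t^i$, where $\delta\phi_t^i$ is
a 3-dimensional vector and $\bar\Omega_t^i$ is the nominal rotation matrix satisfying

$$\dot{\bar\Omega}_t^i = [\bar v_t^i]_\times \bar\Omega_t^i + \bar\Omega_t^i [\bar\omega_t^i]_\times.$$

We can show that $\delta\phi_t^i$ is the solution of the linear differential equation

$$\delta\dot\phi_t^i = [\bar v_t^i]_\times \delta\phi_t^i + \delta v_t^i + \bar\Omega_t^i \delta\omega_t^i.$$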
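
The first-order character of this linear equation can be verified numerically. The following Python sketch is a hedged illustration, not part of the derivation: it assumes constant nominal rates and constant small deviations (all numbers hypothetical), integrates the nonlinear rotation kinematics for the perturbed and nominal inputs with a Runge-Kutta step, integrates the linear equation for the small-angle vector with Heun's method, and compares the perturbed rotation times the transpose of the nominal one with the identity plus the cross-product matrix of the small-angle vector.

```python
def skew(a):
    """Cross-product matrix [a]x."""
    return [[0.0, -a[2], a[1]], [a[2], 0.0, -a[0]], [-a[1], a[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, x):
    return [sum(A[i][k] * x[k] for k in range(3)) for i in range(3)]

def add(A, B, scale=1.0):
    """Elementwise A + scale * B for 3x3 matrices."""
    return [[A[i][j] + scale * B[i][j] for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def omega_dot(Om, v, w):
    """Right side of the rotation kinematics: [v]x Om + Om [w]x."""
    return add(matmul(skew(v), Om), matmul(Om, skew(w)))

def rk4_matrix(Om, v, w, dt):
    k1 = omega_dot(Om, v, w)
    k2 = omega_dot(add(Om, k1, dt / 2), v, w)
    k3 = omega_dot(add(Om, k2, dt / 2), v, w)
    k4 = omega_dot(add(Om, k3, dt), v, w)
    incr = add(add(k1, k2, 2.0), add(k4, k3, 2.0))   # k1 + 2 k2 + 2 k3 + k4
    return add(Om, incr, dt / 6)

v_bar, w_bar = [0.3, -0.2, 0.1], [0.05, 0.1, -0.15]     # hypothetical nominal rates
d_v, d_w = [1e-3, 2e-3, -1e-3], [-2e-3, 1e-3, 1e-3]     # hypothetical small deviations
v = [a + b for a, b in zip(v_bar, d_v)]
w = [a + b for a, b in zip(w_bar, d_w)]

def d_phi_dot(phi, Om_nom):
    """Right side of the linear equation for the small-angle vector."""
    cross = matvec(skew(v_bar), phi)
    drive = matvec(Om_nom, d_w)
    return [cross[i] + d_v[i] + drive[i] for i in range(3)]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Om, Om_bar, d_phi = identity, identity, [0.0, 0.0, 0.0]
dt, steps = 1e-3, 1000
for _ in range(steps):
    Om_bar_next = rk4_matrix(Om_bar, v_bar, w_bar, dt)
    k1 = d_phi_dot(d_phi, Om_bar)
    predictor = [d_phi[i] + dt * k1[i] for i in range(3)]
    k2 = d_phi_dot(predictor, Om_bar_next)
    d_phi = [d_phi[i] + 0.5 * dt * (k1[i] + k2[i]) for i in range(3)]
    Om = rk4_matrix(Om, v, w, dt)
    Om_bar = Om_bar_next

E = matmul(Om, transpose(Om_bar))                     # perturbed times nominal-transpose
zeroth_order_error = max(abs(x) for row in add(E, identity, -1.0) for x in row)
residual = add(add(E, identity, -1.0), skew(d_phi), -1.0)
first_order_error = max(abs(x) for row in residual for x in row)
print(f"error of E ~ I:            {zeroth_order_error:.2e}")
print(f"error of E ~ I + [d_phi]x: {first_order_error:.2e}")
```

The residual of the first-order expression should be much smaller than that of the zeroth-order one, with what remains being of second order in the deviations.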

The  goal  of  the  open-loop  control  (hf’*,  hf’*)  is  to  maintain  y\  =  0  and  ^  =  0. 
These  conditions  are  respectively  equivalent  to  f Ttplt  =  [0  0  1]T  and  Ajflj  = 

where  AJ  is  defined  as 

a  i  =  i+[y(m+4)]x. 

We  can  show  that  y\  —  0  holds,  if  and  only  if  v\  satisfies 

hv\ = -iSWt + JlMPt (5.28)

where the 2×2 matrix J is defined as


Also,  solving  the  equation  A =  Q\+td  for  we  find 

K  =  ( JL y1  IM+,A  -  L-'t,  (o.29) 

which  is  the  condition  on  that  leads  to  =  0.  We  shall  assume  that  the  state- 
space  equations  (5.23)  and  (5.24)  allow  upt ’*  and  hf’*  to  achieve  the  conditions  (5.28) 
and  (5.29). 

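Let $\bar x_t^{p,i}$ denote the state of (5.23) under the nominal control $\bar u_t^{p,i}$. We linearize
the state-space model (5.23) around $\bar x_t^{p,i}$ and $\bar u_t^{p,i}$ to obtain

$$\delta\dot x_t^{p,i} = A_t^{p,i}\delta x_t^{p,i} + B_t^{p,i}\delta u_t^{p,i}, \qquad \delta v_t^i = C_t^{p,i}\delta x_t^{p,i}$$

where $A_t^{p,i}$, $B_t^{p,i}$, and $C_t^{p,i}$ are defined as

$$A_t^{p,i} = \left.\frac{\partial f(x,u)}{\partial x}\right|_{x=\bar x_t^{p,i},\,u=\bar u_t^{p,i}}, \qquad
B_t^{p,i} = \left.\frac{\partial f(x,u)}{\partial u}\right|_{x=\bar x_t^{p,i},\,u=\bar u_t^{p,i}}, \qquad
C_t^{p,i} = \left.\frac{\partial g(x)}{\partial x}\right|_{x=\bar x_t^{p,i}}.$$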
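
When f and g are available only as code, the Jacobians above can be approximated by finite differences. The sketch below is a minimal illustration with a hypothetical pendulum-like single-axis model standing in for f; the actual pointing assembly dynamics are not specified in the text, so the model, the nominal point, and the helper names are assumptions.

```python
import math

def jacobian(func, point, eps=1e-6):
    """Central-difference Jacobian of a vector-valued function at `point`."""
    n_out = len(func(point))
    cols = []
    for j in range(len(point)):
        hi, lo = list(point), list(point)
        hi[j] += eps
        lo[j] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(func(hi), func(lo))])
    # cols[j][i] = d func_i / d point_j; transpose to the usual row layout.
    return [[cols[j][i] for j in range(len(point))] for i in range(n_out)]

def f(x, u):
    """Hypothetical single-axis model: x = (angle, rate), u = (torque,)."""
    return [x[1], -math.sin(x[0]) - 0.5 * x[1] + u[0]]

x_bar, u_bar = [0.2, 0.0], [math.sin(0.2)]       # an equilibrium of the toy model
A = jacobian(lambda x: f(x, u_bar), x_bar)       # plays the role of the state matrix
B = jacobian(lambda u: f(x_bar, u), u_bar)       # plays the role of the input matrix
print("A =", A)
print("B =", B)
```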

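In summary, we present the linearized version of (5.25) through the following
state-space equations

$$\begin{aligned}
\delta\dot x_t^{p,i} &= A_t^{p,i}\delta x_t^{p,i} + B_t^{p,i}\delta u_t^{p,i} \\
\delta\dot\phi_t^i &= [\bar v_t^i]_\times \delta\phi_t^i + C_t^{p,i}\delta x_t^{p,i} + \bar\Omega_t^i\delta\omega_t^i \\
\delta\dot x_t^{\alpha,i} &= A^\alpha\delta x_t^{\alpha,i} + B^\alpha\delta u_t^{\alpha,i} \\
\delta\dot x_t^{\beta,i} &= A^\beta\delta x_t^{\beta,i} + B^\beta\delta u_t^{\beta,i}.
\end{aligned} \qquad (5.30)$$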


Assuming  that  (5.28)  and  (5.29)  hold,  we  linearize  (5.26)  around  the  nominal  tra¬ 
jectory  to  get 


vi  =  fc  (JIM,  +  U-KM)  +  HCTSxf 

4<i  =  JI.AM  +  Ij W5+,,  -  (r  x  J.T)  (kc°6x?‘  +  LCHM  +  &;) , 

(5.31) 

We note that the linearized model is not identical for i = a, b, as it depends on the
nominal trajectory, which is different for stations a and b. When (5.30) is used to
describe the transceiver of Figure 5.1, the control vectors $u_t^{\alpha,i}$ and $u_t^{\beta,i}$ are identically
zero. For the transceiver of Figure 5.2, since the pointing assembly is used only for
the open-loop control, we set $\delta u_t^{p,i} = 0$ in the first equation of (5.30).


5.5  Stochastic  Model 

As mentioned in Section 5.3.7, the disturbance vector $(\omega_t^i, p_t^i, \epsilon_t^i)$ can be adequately
described in a stochastic framework. Since our analysis in Chapter 6 focuses on the
linearized model of Section 5.4, the goal of this section is to develop a stochastic
description for the linearized disturbance vectors $\delta\omega_t^i$, $\delta\zeta_t^i$, and $\delta\epsilon_t^i$.

We use a set of stochastic differential equations to model the disturbance
vectors $\delta\omega_t^i$, $\delta\zeta_t^i$, and $\delta\epsilon_t^i$. Justified by the assumption that these "deviation" vectors
have small norms, we consider a linear structure for the equations, i.e., we
describe $\delta\omega_t^i$, $\delta\zeta_t^i$, and $\delta\epsilon_t^i$ by¹


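$$dx_t^{\omega,i} = A_t^{\omega,i}x_t^{\omega,i}\,dt + D_t^{\omega,i}\,dw_t^{\omega,i}, \qquad \delta\omega_t^i = C_t^{\omega,i}x_t^{\omega,i} \qquad (5.32a)$$

$$dx_t^{\zeta,i} = A_t^{\zeta,i}x_t^{\zeta,i}\,dt + D_t^{\zeta,i}\,dw_t^{\zeta,i}, \qquad \delta\zeta_t^i = C_t^{\zeta,i}x_t^{\zeta,i} \qquad (5.32b)$$

$$dx_t^{\epsilon,i} = A_t^{\epsilon,i}x_t^{\epsilon,i}\,dt + D_t^{\epsilon,i}\,dw_t^{\epsilon,i}, \qquad \delta\epsilon_t^i = C_t^{\epsilon,i}x_t^{\epsilon,i} \qquad (5.32c)$$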

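where $x_t^{\omega,i} \in \mathbb{R}^{n_\omega}$, $x_t^{\zeta,i} \in \mathbb{R}^{n_\zeta}$, and $x_t^{\epsilon,i} \in \mathbb{R}^{n_\epsilon}$ are the state vectors and $w_t^{\omega,i} \in \mathbb{R}^{p_\omega}$,
$w_t^{\zeta,i} \in \mathbb{R}^{p_\zeta}$, and $w_t^{\epsilon,i} \in \mathbb{R}^{p_\epsilon}$ are vector-valued standard Wiener processes.
Here, the matrices are uniformly bounded and have appropriate dimensions. The
initial states $x_0^{\omega,i}$, $x_0^{\zeta,i}$, and $x_0^{\epsilon,i}$ are Gaussian random vectors with known means
and covariance matrices. Since $\delta\omega_t^i$, $\delta\zeta_t^i$, and $\delta\epsilon_t^i$ have independent physical origins,
the initial states $x_0^{\omega,i}$, $x_0^{\zeta,i}$, and $x_0^{\epsilon,i}$ and the Wiener processes $\{w_t^{\omega,i}\}$, $\{w_t^{\zeta,i}\}$,
and $\{w_t^{\epsilon,i}\}$ are assumed to be statistically independent. Also, the initial states
and Wiener processes associated with stations a and b are statistically independent,
except for the pairs $(x_0^{\zeta,a}, x_0^{\zeta,b})$ and $(w_t^{\zeta,a}, w_t^{\zeta,b})$, which must satisfy $x_0^{\zeta,a} + x_0^{\zeta,b} = 0$
and $w_t^{\zeta,a} + w_t^{\zeta,b} = 0$, $t \ge 0$, so that $\delta\zeta_t^a + \delta\zeta_t^b = 0$ holds.

¹The equations are defined in the Itô sense.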
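
The effect of these constraints can be seen in a short simulation. The sketch below is a hedged scalar illustration: it assumes that both stations use the same constant coefficients in (5.32b), which the text does not state explicitly, and it discretizes the equation with the Euler-Maruyama scheme. All numerical values and names are hypothetical. Driving the two stations with opposite initial states and opposite Wiener increments keeps the sum of the two outputs at zero for all time steps.

```python
import math
import random

def simulate_pair(a, d, c, x0, dt, steps, rng):
    """Euler-Maruyama for dx = a x dt + d dw at two stations with w_b = -w_a and x0_b = -x0_a."""
    xa, xb = x0, -x0
    out_a, out_b = [], []
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment over one step
        xa += a * xa * dt + d * dw
        xb += a * xb * dt + d * (-dw)
        out_a.append(c * xa)
        out_b.append(c * xb)
    return out_a, out_b

rng = random.Random(1)
zeta_a, zeta_b = simulate_pair(a=-2.0, d=0.1, c=1.0, x0=0.05, dt=1e-3, steps=5000, rng=rng)
largest_sum = max(abs(p + q) for p, q in zip(zeta_a, zeta_b))
print(f"largest |zeta_a + zeta_b| along the path: {largest_sum:.1e}")
```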

The state-space equations (5.30) and (5.32) and the output equations (5.31)
can be combined in a compact form¹

$$dx_t^i = A_t^i x_t^i\,dt + B_t^i u_t^i\,dt + D_t^i\,dw_t^i \qquad (5.33a)$$

$$y_t^i = C_t^i x_t^i \qquad (5.33b)$$

$$\psi_t^i = (\psi/2)\,L_t^i z_t^i \qquad (5.33c)$$

where 


A  = 

'dxf 

r  a.i  c  B.i  uj.i 

0Xt  OXt  xt 

4  = 

'Sxf 

r  a.i  c  B.i  uj.i 

0Xt  OXt  xt 

A  = 

~Suf 

c  Oi . 

6ut 

'  Su?-‘]T 

w\  = 

uj.i 

wt 

£,i 

w 

1  T 

£,l 

Wt 

(5.34) 


and the block matrices $\{A_t^i, B_t^i, D_t^i, C_t^i, L_t^i\}$ can be obtained in terms of the matrices
appearing in (5.30), (5.31), and (5.32). Here, the initial state $x_0^i$ is a Gaussian
random vector with mean $\bar x_0^i$ and covariance matrix $\Sigma_0^i$. Considering (5.34), we can
determine matrices $\Pi$ and $\mathcal{K}$ such that


z\  =  Yix\  +  Wx\ 


(5.35) 


¹The notation $u_t^i$, which is used here for the linearized model, should not be confused with the
same notation used in Section 5.3.6 for the nonlinear case.

Also, we note that the linear constraints $x_0^{\zeta,a} + x_0^{\zeta,b} = 0$ and $w_t^{\zeta,a} + w_t^{\zeta,b} = 0$ can be
expressed as

$$\Gamma_x(x_0^a + x_0^b) = 0, \qquad \Gamma_w(w_t^a + w_t^b) = 0 \qquad (5.36)$$

with properly defined matrices $\Gamma_x$ and $\Gamma_w$.

In  the  following,  we  discuss  some  properties  of  matrices  B\,  Clt)  and  L\  which 
will  be  used  later  in  Chapter  6.  Assume  that  8e\  is  identically  zero.  Then,  regarding 
the  transceiver  of  Figure  5.1  for  which  =  0,  8xy  =  0,  and  td  =  0,  we  observe 
from  (5.31)  that  y\  =  Comparing  this  result  with  (5.33b)  and  (5.33c)  we  find 
that 

Vt  =  2(fci,ylCl  (5.37) 

This  result  also  holds  for  the  transceiver  of  Figure  5.2,  if  in  addition  to  td  —  0 
and  5e\  =  0,  we  have  H  +  fcK  =  0.  Regarding  Blt ,  we  can  verify  that  this  matrix 
satisfies  h \dB\  =  0  for  every  t  ^  0.  This  follows  from  the  fact  that  xfl  does  not 
depend  on  the  control  vector  u\. 

The  space-time  rates  (5.20)  can  be  expressed  in  terms  of  x\  and  z\  as 


_  Tb 
2  || 


l(r  -  CX)  +  A 


A?  (<■)  =  <-«,  exp  (-i 
A?  (r)  =  exp  (-1  ||if_t^f-.J|2)  7 (r  -  Cf4)  +  A. 


(5.38) 


We allow $\{\nu_t^a\}$ and $\{\nu_t^b\}$ to be nonnegative stochastic processes with piecewise
continuous and bounded sample paths and nonzero mean¹. Further, we assume
that $\{\nu_t^a\}$ and $\{\nu_t^b\}$ are mutually independent and are independent of $x_0^i$ and
$\{w_t^i\}$, i = a, b.

¹This characterizes the effect of random optical fade and data modulation.

Chapter 6


Cooperative  Optical  Beam  Tracking:  Optimal  Control 

6.1  Introduction 

In  Chapter  5,  we  discussed  the  concept  of  cooperative  optical  beam  tracking  and 
developed  a  mathematical  model  for  this  alignment  scheme.  In  the  next  step,  the 
dynamical  equations  associated  with  this  model  were  linearized  around  a  nominal 
trajectory  and  a  stochastic  description  for  the  disturbance  vectors  was  introduced. 
In  the  present  chapter,  we  use  this  linearized  stochastic  model  in  order  to  study 
the  problem  of  controller  design  for  a  cooperative  optical  beam  tracking  system. 
The  design  goal  is  to  maximize  the  flow  of  optical  energy  between  the  stations. 
We shall study the problem (separately) for two scenarios: short range applications
with $t_d = 0$ and long range applications with $t_d > 0$.

The  organization  of  this  chapter  is  as  follows.  In  Section  6.2,  we  first  introduce 
some  additional  assumptions  on  the  model  of  Chapter  5  and  modify  it  accordingly, 
and  then  define  its  associated  control  problem.  Section  6.3  considers  an  associated 
estimation  problem  and  presents  an  approximate  solution  for  it.  This  solution  will 
be used in Sections 6.4 and 6.5 in order to develop two different methods for solving
the control problem for $t_d = 0$ and $t_d > 0$, respectively.


6.2  Model  and  Problem  Statement 

In order to use the results of Theorem 3.3.2 in our analysis, we shall assume that $\gamma(r)$
in (5.38) has a Gaussian profile (see Section 5.3.3) and ignore the effect of the dark
current and the background noise by letting A = 0 in (5.38). In addition, we use the
"infinite resolution" and "infinite area" model for the photodetectors as explained
in Section 3.5. These assumptions enable us to describe the output of a position-sensitive
photodetector by a space-time point process over $\mathbb{R}^2$. After solving the
control problem using this idealized model, we can apply the approximation method
of Section 3.5 to modify the solution for a practical finite resolution photodetector.

6.2.1  The  Model 

Following the model of Section 5.5, we describe the dynamics of station i = a, b by
the state-space equation

$$dx_t^i = A_t^i x_t^i\,dt + B_t^i u_t^i\,dt + D_t^i\,dw_t^i \qquad (6.1)$$

where $x_t^i \in \mathbb{R}^n$ is the state vector and $\{w_t^i\}$ is a p-dimensional standard Wiener
process. The initial state $x_0^i$ is independent of $\{w_t^a\}$ and $\{w_t^b\}$ and is assumed to be
a Gaussian vector with mean $\bar x_0^i = 0$ and covariance matrix $P_0^i$. Recall from Section 5.5
that $x_0^i$, i = a, b, and $\{w_t^i\}$, i = a, b, satisfy the linear constraints (5.36). The
control vector $u_t^i$ will be discussed separately for $t_d = 0$ and $t_d > 0$ in Section 6.2.2.

The observation at station i = a, b, which is provided over $\mathbb{R}^2$, is the space-time
point process $N^i(T \times S)$ with rate


A’(r,4K)  =V,\Sh(r-,Clx\,R) 


(6,2) 


where $R = 2(f_c/k\varrho)^2 I_{2\times 2}$ is defined in Section 5.3.3 and the Gaussian map $\Phi_2(\cdot)$ is
given by (3.5). The stochastic processes $\{\mu_t^a\}$ and $\{\mu_t^b\}$ are defined as


K  =  4«,exp(-i||LLld4-jr) 

4  =  "1-1,  a<P  (-5 


(6.3) 


where the nonnegative stochastic process $\{\nu_t^i\}$ is statistically independent of $x_0^i$
and $\{w_t^i\}$, i = a, b, and has piecewise continuous and bounded sample paths and
nonzero mean. The stochastic vector $z_t^i$ is determined in terms of the state vector $x_t^i$
according to¹

4  =  M  +  n  d4+u-  («•«) 


We shall assume that prior to t = 0, the stations are under the open-loop
control $u_t^a = u_t^b = 0$. This implies that for $t \in [-t_d, 0]$, $x_t^i$ is a zero-mean Gaussian
random vector with covariance matrix $P_t^i$ which satisfies the matrix differential equation

$$\dot P_t^i = A_t^i P_t^i + P_t^i (A_t^i)^T + D_t^i (D_t^i)^T.$$
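
The covariance equation above can be integrated directly. The following Python sketch is a minimal illustration under simplifying assumptions that are not in the text: constant matrices, a square noise-gain matrix, and forward-Euler time stepping. The matrices are hypothetical. For a stable diagonal system driven by independent noise, each diagonal entry should settle at the squared noise gain divided by twice the decay rate.

```python
def lyapunov_rhs(A, P, D):
    """A P + P A^T + D D^T for square lists-of-lists of equal size."""
    n = len(A)
    return [[sum(A[i][k] * P[k][j] + P[i][k] * A[j][k] + D[i][k] * D[j][k]
                 for k in range(n))
             for j in range(n)] for i in range(n)]

def propagate(A, D, P0, dt, steps):
    """Forward-Euler integration of the covariance differential equation."""
    P = [row[:] for row in P0]
    n = len(P)
    for _ in range(steps):
        dP = lyapunov_rhs(A, P, D)
        P = [[P[i][j] + dt * dP[i][j] for j in range(n)] for i in range(n)]
    return P

A = [[-1.0, 0.0], [0.0, -4.0]]      # hypothetical stable, constant dynamics
D = [[0.2, 0.0], [0.0, 0.4]]        # hypothetical noise gains
P_final = propagate(A, D, [[0.0, 0.0], [0.0, 0.0]], dt=1e-3, steps=20_000)
print("covariance after 20 time units:", P_final)
```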


Note that due to the propagation delay $t_d$, the initial state of the system must be
given over the interval $t \in [-t_d, 0]$, instead of at the single point t = 0.

¹See (5.35) in Section 5.5.

Over the underlying probability space $(\Omega, \mathcal{F}, P)$ of the above stochastic model,
we define $\mathcal{B}_t^i$, i = a, b, as the σ-algebra generated by the space-time point process $N^i$
over [0, t). We say $u_t^a$ and $u_t^b$ are admissible controls if $u_t^i$ is $\mathcal{B}_t^i$-measurable and the
solution to (6.1) is well defined for i = a, b.

6.2.2  Problem  Statement 

For long range applications using the transceiver of Figure 5.2, the control $u_t^i$ is a 4-dimensional vector comprised of $u_t^{o,i} \in \mathbb{R}^2$ and $u_t^{p,i} \in \mathbb{R}^2$, which are the control vectors associated with the tracking mirror and the point-ahead mirror, respectively. In terms of $u_t^{o,i}$ and $u_t^{p,i}$, the state-space equation (6.1) can be expressed as

\[ dx_t^i = A_t^i x_t^i\,dt + B_t^{o,i} u_t^{o,i}\,dt + B_t^{p,i} u_t^{p,i}\,dt + D_t^i\,dw_t^i \tag{6.5} \]

where $B_t^{o,i}$ and $B_t^{p,i}$ are defined such that $B_t^i = [\,B_t^{o,i} \;\; B_t^{p,i}\,]$. Note that the control $u_t^{o,i}$ is employed for the purpose of tracking, i.e., maintaining $\|C_t^i x_t^i\|$ close to 0, while $u_t^{p,i}$ is used for precise pointing, i.e., keeping $\|L_t^i z_t^i\|$ as small as possible. We observe from (5.30) and (5.31) that $C_t^i x_t^i$ does not depend on $u_t^{p,i}$, thus the tracking control $u_t^{o,i}$ can be designed subject to (6.5) with $u_t^{p,i} = 0$. The design problem can be formulated in terms of minimizing the cost functional (3.6) with $Q_t = C_t^{iT} C_t^i$, $P_t = \eta I_{2\times 2}$, $\eta > 0$, and $S = 0$ as explained in Section 3.5. A simpler scheme is to obtain $u_t^{o,i}$ such that $E[C_t^i x_t^i \,|\, \mathcal{B}_t^i] = 0$. This scheme needs some additional assumptions, which will be explained later through Lemma 6.4.1.

After determining an admissible $u_t^{o,i}$, we solve the pointing control problem subject to (6.5) with the already obtained $u_t^{o,i}$. For the nonnegative constants $k^a$ and $k^b$ and the fixed time horizon $T > t_d \geq 0$, we define the objective functional


\[ J = E\left[\int_0^T \left( k^a \nu_t^a \exp\left(-\tfrac{1}{2}\|L_t^b z_t^b\|^2\right) + k^b \nu_t^b \exp\left(-\tfrac{1}{2}\|L_t^a z_t^a\|^2\right) \right) dt\right] \tag{6.6} \]

as a linear combination of the expected optical energy received by the stations during $[t_d, T + t_d]$. Then, the pointing control problem can be defined as follows. Subject to (6.5), determine the admissible controls $\{u_t^{p,a}, t \in [0, T]\}$ and $\{u_t^{p,b}, t \in [0, T]\}$ that maximize the objective functional (6.6).

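For short range applications with $t_d = 0$, the control vector $u_t^i \in \mathbb{R}^2$ is associated with either the pointing assembly or the tracking mirror. In this case, the control problem is to obtain the admissible controls $\{u_t^a, t \in [0, T]\}$ and $\{u_t^b, t \in [0, T]\}$ that maximize the objective functional

\[ J = E\left[\int_0^T \left( k^a \nu_t^a \exp\left(-\tfrac{1}{2}\|L_t^b x_t^b\|^2\right) + k^b \nu_t^b \exp\left(-\tfrac{1}{2}\|L_t^a x_t^a\|^2\right) \right) dt\right]. \tag{6.7} \]

We note that for the case of $t_d = 0$, (6.3) can be simplified as

\[ \mu_t^a = \nu_t^a \exp\left(-\tfrac{1}{2}\|L_t^b x_t^b\|^2\right), \qquad \mu_t^b = \nu_t^b \exp\left(-\tfrac{1}{2}\|L_t^a x_t^a\|^2\right). \tag{6.8} \]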




6.3  Estimation  Problem 


The first step in solving the control problem is to determine the posterior density $p_{x_t^i}(x \,|\, \mathcal{B}_t^i)$. While the structure of the model for a single station¹ is similar to the model of Chapter 3, this model does not completely satisfy the assumptions of Theorem 3.3.2, due to the statistical dependence between $\{\mu_t^i\}$ and $\{w_t^i\}$. This dependence is manifested by (6.3), which indicates that $\{\mu_t^i\}$ depends on the state of the other station, and as a consequence, the statistical dependence between $\{x_t^a\}$ and $\{x_t^b\}$ leads to statistical dependence between $\{\mu_t^i\}$ and $\{w_t^i\}$. The dependence between $\{x_t^a\}$ and $\{x_t^b\}$ has two origins: the linear constraints (5.36), and the dependence of $\{x_t^i\}$ on $\{\mu_t^i\}$ through $u_t^i$, noting that $\{\mu_t^i\}$ depends on the state of the other station.

The discussion above suggests that the coupling between the stations must be involved in an exact solution for the estimation problem. Such an estimation problem is difficult to solve, not only due to the complexity of the model, but also because of the requirement of determining the optimal controls $u_t^a$ and $u_t^b$ prior to solving the estimation problem. A suboptimal solution for the estimation problem which avoids these difficulties can be obtained by applying Theorem 3.3.2 to each individual station, ignoring the statistical dependence between $\{\mu_t^i\}$ and $\{w_t^i\}$. To justify this approximation, we note that for a well-designed system, the expected value of the stochastic process

\[ \exp\left(-\tfrac{1}{2}\|L_t^i z_t^i\|^2\right) \]

must stay close to 1 for every $t \in [0, T]$. This implies that the standard deviation of this process is small compared to its expected value, i.e., we can approximate

\[ \exp\left(-\tfrac{1}{2}\|L_t^i z_t^i\|^2\right) \approx E\left[\exp\left(-\tfrac{1}{2}\|L_t^i z_t^i\|^2\right)\right]. \tag{6.9} \]

Based on (6.9), we approximate the stochastic processes $\{\mu_t^a\}$ and $\{\mu_t^b\}$ by $\{\nu_{t-t_d}^a E[\exp(-\tfrac{1}{2}\|L_{t-t_d}^b z_{t-t_d}^b\|^2)]\}$ and $\{\nu_{t-t_d}^b E[\exp(-\tfrac{1}{2}\|L_{t-t_d}^a z_{t-t_d}^a\|^2)]\}$, respectively. Since $\{\nu_t^i\}$ is statistically independent of $x_0^i$ and $\{w_t^i\}$, these approximations imply that $\{\mu_t^i\}$ is statistically independent of $x_0^i$ and $\{w_t^i\}$. Thus, using Theorem 3.3.2, we approximate

¹This model consists of the state-space equation (6.1) and the observation of the space-time point process with rate (6.2).

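\[ p_{x_t^i}(x \,|\, \mathcal{B}_t^i) \approx \hat{p}_{x_t^i}(x \,|\, \mathcal{B}_t^i) = \mathcal{N}(x; \hat{x}_t^i, \Sigma_t^i) \tag{6.10} \]

where $\hat{x}_t^i$ and $\Sigma_t^i$ are the solutions of the stochastic differential equations

\[ d\hat{x}_t^i = A_t^i \hat{x}_t^i\,dt + B_t^i u_t^i\,dt + \int_{\mathbb{R}^2} M_t^i (r - C_t^i \hat{x}_t^i)\, N^i(dt \times dr) \tag{6.11a} \]

\[ d\Sigma_t^i = A_t^i \Sigma_t^i\,dt + \Sigma_t^i A_t^{iT}\,dt + D_t^i D_t^{iT}\,dt - M_t^i C_t^i \Sigma_t^i\,dN_t^i \tag{6.11b} \]

with the initial states $\hat{x}_0^i = \bar{x}_0^i = 0$ and $\Sigma_0^i = P_0^i$. Here, the stochastic process $\{N_t^i\}$ and the stochastic matrix $M_t^i$ are defined as

\[ N_t^i = N^i([0, t) \times \mathbb{R}^2) \]

\[ M_t^i = \Sigma_t^i C_t^{iT} \left(C_t^i \Sigma_t^i C_t^{iT} + R\right)^{-1}. \]

From the definition of $N^i(T \times S)$, we know that $\{N_t^i\}$ is a doubly stochastic Poisson process with rate $\{\mu_t^i\}$, which is defined by (6.3) (or by (6.8) for the short range applications).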

Note that the estimator (6.11) does not explicitly depend on $\{\mu_t^i, t \geq 0\}$; however, the estimates $\hat{x}_t^i$ and $\Sigma_t^i$ depend on $\{\mu_t^i, t \geq 0\}$ through the space-time point process. This dependence can be explained by observing from (6.11b) that the occurrence of each event in the space-time point process decreases $\Sigma_t^i$ by the positive semidefinite matrix $M_t^i C_t^i \Sigma_t^i$. Thus, a larger $\mu_t^i$ leads to a smaller estimation error by increasing the occurrence rate of the events. According to (6.3), a smaller pointing error $\|L_{t-t_d}^b z_{t-t_d}^b\|$ at station $b$ results in a larger $\mu_t^a$ and, as a consequence, a closer estimate for $x_t^a$, which in turn leads to a smaller pointing error at station $a$. This explains the mechanism which couples the dynamics of the stations.
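The covariance jump in (6.11b) can be illustrated with a small numerical sketch. The code below applies the update $\Sigma - M C \Sigma$, with $M = \Sigma C^T (C \Sigma C^T + R)^{-1}$, for a single detection event. It is a sketch under stated assumptions: the observation is two-dimensional (so the inverse is 2x2), and the helper names and example matrices are hypothetical.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def event_update(sigma, c, r):
    """Covariance after one event: Sigma - M C Sigma, M = Sigma C^T (C Sigma C^T + R)^-1."""
    ct = transpose(c)
    s = mat_mul(mat_mul(c, sigma), ct)
    innovation = [[s[i][j] + r[i][j] for j in range(2)] for i in range(2)]
    gain = mat_mul(mat_mul(sigma, ct), inv2(innovation))
    drop = mat_mul(mat_mul(gain, c), sigma)
    n = len(sigma)
    return [[sigma[i][j] - drop[i][j] for j in range(n)] for i in range(n)]


# Example: each event shrinks the diagonal of the covariance.
identity = [[1.0, 0.0], [0.0, 1.0]]
sigma_after = event_update([[2.0, 0.0], [0.0, 1.0]], identity, identity)
```

Between events the covariance grows according to the drift terms of (6.11b), so a higher event rate keeps the covariance smaller on average, which is the coupling mechanism described above.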

The posterior density $p_{z_t^i}(z \,|\, \mathcal{B}_t^i)$ can be obtained in terms of $p_{x_t^i}(x \,|\, \mathcal{B}_t^i)$ as stated in the following theorem.

Theorem 6.3.1. Consider the state-space equation (6.1) and its associated space-time observation with the rate process (6.2). Assume that the increasing family of $\sigma$-algebras $\mathcal{B}_t^i$ is given and that $u_t^i$ is $\mathcal{B}_t^i$-measurable. Then, for $z_t^i$ defined by (6.4), the posterior density $p_{z_t^i}(z \,|\, \mathcal{B}_t^i)$ is given by

\[ p_{z_t^i}(z \,|\, \mathcal{B}_t^i) = \int_{\mathbb{R}^n} p_{x_t^i}(x \,|\, \mathcal{B}_t^i)\, \mathcal{N}(z; H_t x, V_t)\, dx \tag{6.12} \]

where $H_t$ and $V_t$ are defined as¹

\[ H_t = \Pi + \Pi_d \Phi^{A^i}(t + t_d, t) \]

and

\[ V_t = \int_t^{t+t_d} \left(\Pi_d \Phi^{A^i}(t + t_d, \tau) D_\tau^i\right) \left(\Pi_d \Phi^{A^i}(t + t_d, \tau) D_\tau^i\right)^T d\tau. \]

Here, $\Phi^{A^i}(t, \tau)$ is the transition matrix associated with $A_t^i$.

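Proof. We know from Section 5.5 that $\Pi_d D_t^i = 0$ for every $t \geq 0$. Thus, solving the linear stochastic differential equation (6.1) from $t$ to $t + t_d$, we get

\[ \Pi_d x_{t+t_d}^i = \Pi_d \Phi^{A^i}(t + t_d, t)\, x_t^i + \Pi_d v_t^i \]

where

\[ v_t^i = \int_t^{t+t_d} \Phi^{A^i}(t + t_d, \tau)\, D_\tau^i\, dw_\tau^i. \]

From the definition of $z_t^i$ and $H_t$ we can write

\[ z_t^i = H_t x_t^i + \Pi_d v_t^i. \]

The conditional characteristic function of $z_t^i$ given $\mathcal{B}_t^i$ is given by

\[ \begin{aligned} E\left[\exp(j\omega^T z_t^i) \,|\, \mathcal{B}_t^i\right] &= E\left[\exp(j\omega^T H_t x_t^i) \exp(j\omega^T \Pi_d v_t^i) \,|\, \mathcal{B}_t^i\right] \\ &= E\left[\exp(j\omega^T H_t x_t^i) \,|\, \mathcal{B}_t^i\right] E\left[\exp(j\omega^T \Pi_d v_t^i)\right] \\ &= E\left[\exp(j\omega^T H_t x_t^i) \,|\, \mathcal{B}_t^i\right] \exp\left(-\tfrac{1}{2}\omega^T V_t\, \omega\right). \end{aligned} \]

Taking the inverse Fourier transform of the last expression above, we obtain (6.12). □

¹We know from Section 5.5 that $H_t$ and $V_t$ are identical for stations $a$ and $b$, thus the superscript $i$ has been dropped.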

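Using Theorem 6.3.1 and Lemma 4.5.1, we determine the approximation of $p_{z_t^i}(z \,|\, \mathcal{B}_t^i)$ associated with (6.10) as

\[ \hat{p}_{z_t^i}(z \,|\, \mathcal{B}_t^i) = \mathcal{N}\left(z; \hat{z}_t^i, H_t \Sigma_t^i H_t^T + V_t\right) \tag{6.13} \]

where $\hat{z}_t^i$ is defined as $\hat{z}_t^i = H_t \hat{x}_t^i$. For future reference, we use (6.13) to obtain

\[ \int \hat{p}_{z_t^i}(z \,|\, \mathcal{B}_t^i) \exp\left(-\tfrac{1}{2}\|L_t^i z\|^2\right) dz = f_t^i(\Sigma_t^i) \exp\left(-\tfrac{1}{2}\|Q_t^i L_t^i \hat{z}_t^i\|^2\right) \tag{6.14} \]

where $f_t^i(\cdot)$ is defined as

\[ f_t^i(X) = \left[\det\left(I_{2\times 2} + L_t^i \left(H_t X H_t^T + V_t\right) L_t^{iT}\right)\right]^{-1/2} \tag{6.15} \]

and the positive definite stochastic matrix $Q_t^i$ is given by¹

\[ Q_t^i = \left(I_{2\times 2} + L_t^i \left(H_t \Sigma_t^i H_t^T + V_t\right) L_t^{iT}\right)^{-1/2}. \tag{6.16} \]

For the case of $t_d = 0$, in which $H_t = I_{n\times n}$ and $V_t = 0$, (6.15) and (6.16) are simplified as

\[ f_t^i(X) = \left[\det\left(I_{2\times 2} + L_t^i X L_t^{iT}\right)\right]^{-1/2} \tag{6.17} \]

and

\[ Q_t^i = \left(I_{2\times 2} + L_t^i \Sigma_t^i L_t^{iT}\right)^{-1/2}. \tag{6.18} \]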
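The identity (6.14) can be sanity-checked numerically in its simplest form. The sketch below treats the scalar case with $t_d = 0$, where $L$, $\Sigma$, and $\hat{z}$ reduce to numbers, and compares a trapezoidal quadrature of the left side against the closed form built from (6.17) and (6.18). The scalar reduction, function names, and quadrature settings are assumptions made for illustration only.

```python
import math


def gaussian_pdf(z, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def lhs_by_quadrature(mean, var, l, points=20001, width=12.0):
    """Trapezoidal estimate of the integral of N(z; mean, var) exp(-(l z)^2 / 2)."""
    half = width * math.sqrt(var)
    step = 2.0 * half / (points - 1)
    total = 0.0
    for k in range(points):
        z = mean - half + k * step
        weight = 0.5 if k in (0, points - 1) else 1.0
        total += weight * gaussian_pdf(z, mean, var) * math.exp(-0.5 * (l * z) ** 2)
    return total * step


def rhs_closed_form(mean, var, l):
    """Scalar analogue of f(Sigma) exp(-|Q L z_hat|^2 / 2) with t_d = 0."""
    y = 1.0 + l * var * l      # scalar version of I + L Sigma L^T
    f = y ** -0.5              # scalar form of (6.17)
    q = y ** -0.5              # scalar form of (6.18): q * q = 1 / y
    return f * math.exp(-0.5 * (q * l * mean) ** 2)


gap = abs(lhs_by_quadrature(0.3, 0.5, 1.2) - rhs_closed_form(0.3, 0.5, 1.2))
```

The two sides agree to within quadrature error, which matches the standard Gaussian integral behind (6.14).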

6.4  Control  Problem:  Short  Range  Applications 

We exploit the estimator (6.11) and the approximate posterior density (6.10) in order to prove Theorem 6.4.1 below, which is the basis for developing a suboptimal control for the case of $t_d = 0$. Before stating the theorem, we fix notation. Let $g_t(\Sigma^a, \Sigma^b)$ be a scalar function of $n \times n$ symmetric matrices $\Sigma^a$ and $\Sigma^b$. Assume that the partial derivatives of $g_t(\Sigma^a, \Sigma^b)$ with respect to the elements of $\Sigma^a$ and $\Sigma^b$ exist. We denote by $\partial g_t(\Sigma^a, \Sigma^b)/\partial \Sigma^i$, $i = a, b$ an $n \times n$ symmetric matrix with diagonal elements $\partial g_t/\partial \sigma_{kk}^i$ and off-diagonal elements $(1/2)\,\partial g_t/\partial \sigma_{kl}^i$, where $\sigma_{kl}^i$ is the element of $\Sigma^i$ at the $k$th row and $l$th column. We define the linear operators $\mathcal{L}_t^a\{\cdot\}$ and $\mathcal{L}_t^b\{\cdot\}$ as

\[ \begin{aligned} \mathcal{L}_t^a\{g_t(\Sigma^a, \Sigma^b)\} &= g_t(S_t^a(\Sigma^a), \Sigma^b) - g_t(\Sigma^a, \Sigma^b) \\ \mathcal{L}_t^b\{g_t(\Sigma^a, \Sigma^b)\} &= g_t(\Sigma^a, S_t^b(\Sigma^b)) - g_t(\Sigma^a, \Sigma^b) \end{aligned} \tag{6.19} \]

where $S_t^i(\cdot)$ is given by

\[ S_t^i(\Sigma) = \Sigma - \Sigma C_t^{iT} \left(C_t^i \Sigma C_t^{iT} + R\right)^{-1} C_t^i \Sigma. \tag{6.20} \]

¹Here, by $X = Y^{-1/2}$, we mean a matrix $X$ that satisfies $X^T X = Y^{-1}$.


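Theorem 6.4.1. Fix sample paths $\nu_t^a$ and $\nu_t^b$, $t \in [0, T]$. Let $\Sigma^a$ and $\Sigma^b$ be $n \times n$ symmetric matrices and assume that $g_t(\Sigma^a, \Sigma^b)$, $t \in [0, T]$ is the solution of the partial differential equation

\[ \begin{aligned} &\frac{\partial g_t(\Sigma^a, \Sigma^b)}{\partial t} + \nu_t^a f_t^b(\Sigma^b)\left(k^a + \mathcal{L}_t^a\{g_t(\Sigma^a, \Sigma^b)\}\right) + \nu_t^b f_t^a(\Sigma^a)\left(k^b + \mathcal{L}_t^b\{g_t(\Sigma^a, \Sigma^b)\}\right) \\ &\quad + \mathrm{tr}\left\{\frac{\partial g_t(\Sigma^a, \Sigma^b)}{\partial \Sigma^a}\left(A_t^a \Sigma^a + \Sigma^a A_t^{aT} + D_t^a D_t^{aT}\right)\right\} \\ &\quad + \mathrm{tr}\left\{\frac{\partial g_t(\Sigma^a, \Sigma^b)}{\partial \Sigma^b}\left(A_t^b \Sigma^b + \Sigma^b A_t^{bT} + D_t^b D_t^{bT}\right)\right\} = 0 \end{aligned} \tag{6.21} \]

with the boundary condition $g_T(\cdot, \cdot) = 0$, where $f_t^i(\cdot)$ is defined by (6.17). Let $\{\Sigma_t^i, t \in [0, T]\}$, $i = a, b$ be the solution of the stochastic differential equation (6.11b) with the initial state $P_0^i$. Then, for the fixed sample paths $\nu_t^a$ and $\nu_t^b$, the objective functional (6.7) can be expressed as

\[ \begin{aligned} J = {} & g_0(P_0^a, P_0^b) \\ & + \int_0^T E\left[\nu_t^a \delta_t^b\left(k^a + \mathcal{L}_t^a\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right) + \nu_t^b \delta_t^a\left(k^b + \mathcal{L}_t^b\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\right] dt \\ & - \int_0^T E\left[\nu_t^a f_t^b(\Sigma_t^b)\left(k^a + \mathcal{L}_t^a\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\left\{1 - \exp\left(-\tfrac{1}{2}\|Q_t^b L_t^b \hat{x}_t^b\|^2\right)\right\}\right] dt \\ & - \int_0^T E\left[\nu_t^b f_t^a(\Sigma_t^a)\left(k^b + \mathcal{L}_t^b\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\left\{1 - \exp\left(-\tfrac{1}{2}\|Q_t^a L_t^a \hat{x}_t^a\|^2\right)\right\}\right] dt \end{aligned} \tag{6.22} \]

where $Q_t^i$, $i = a, b$ is given by (6.18) and the error term $\delta_t^i$, $i = a, b$ is defined as

\[ \delta_t^i = \exp\left(-\tfrac{1}{2}\|L_t^i x_t^i\|^2\right) - f_t^i(\Sigma_t^i) \exp\left(-\tfrac{1}{2}\|Q_t^i L_t^i \hat{x}_t^i\|^2\right). \tag{6.23} \]

Moreover, if $\Sigma^a$ and $\Sigma^b$ are positive definite, $\mathcal{L}_t^a\{g_t(\Sigma^a, \Sigma^b)\}$ and $\mathcal{L}_t^b\{g_t(\Sigma^a, \Sigma^b)\}$ are nonnegative for every $t \in [0, T]$.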

Proof.  See  Section  6.6.  □ 

In the following, we use the results of Theorem 6.4.1 to develop a suboptimal solution for the control problem. Clearly, the first term on the right side of (6.22) is not involved in the optimization, since it does not depend on $u_t^a$ and $u_t^b$. Even though the hard-to-compute error terms $\delta_t^a$ and $\delta_t^b$ do depend on $u_t^a$ and $u_t^b$, they are small, at least under the suboptimal control that will be obtained. Therefore, in maximizing (6.22), we ignore the second term on the right side and maximize $\tilde{J}$, which is defined as the sum of the third and the fourth terms.

Assuming that $k^i > 0$, $i = a, b$, for the fixed sample paths $\nu_t^a$ and $\nu_t^b$, we have $\tilde{J} \leq 0$, with equality if and only if $L_t^a \hat{x}_t^a = 0$ for almost every $t \in \{\tau \,|\, \nu_\tau^b \neq 0, \tau \in [0, T]\}$ and $L_t^b \hat{x}_t^b = 0$ for almost every $t \in \{\tau \,|\, \nu_\tau^a \neq 0, \tau \in [0, T]\}$. Assuming that the stochastic processes $\{\nu_t^a\}$ and $\{\nu_t^b\}$ have nonzero sample paths during $t \in [0, T]$, the condition $\tilde{J} = 0$ holds for all sample paths of $\{\nu_t^a\}$ and $\{\nu_t^b\}$ if and only if for almost every $t \in [0, T]$ we have

\[ L_t^a \hat{x}_t^a = L_t^b \hat{x}_t^b = 0. \tag{6.24} \]

Note that this condition is sufficient for $\tilde{J} = 0$ even if the assumption above does not hold, i.e., even if, with a nonzero probability, some of the sample paths of $\{\nu_t^a\}$ and $\{\nu_t^b\}$ are identically zero over an interval $\mathcal{I} \subset [0, T]$.

Assuming that under the condition (6.24) the error term (second term) on the right side of (6.22) is negligible, for fixed sample paths $\nu_t^a$ and $\nu_t^b$, the maximum of $J$ can be approximated by $J^* \approx g_0(P_0^a, P_0^b)$. In order to generalize this result to the stochastic processes $\{\nu_t^a\}$ and $\{\nu_t^b\}$, we need to average $g_0(P_0^a, P_0^b)$ over all sample paths of $\{\nu_t^a\}$ and $\{\nu_t^b\}$. For this purpose, consider (6.21) with the sample path $\nu_t^i$ replaced with the stochastic process $\{\nu_t^i\}$, $i = a, b$ and let the random variable $g_0(P_0^a, P_0^b)$ be the solution of this stochastic differential equation at $t = 0$. Then, the maximum of $J$ can be approximated as

\[ J^* \approx E\left[g_0(P_0^a, P_0^b)\right]. \]


Remark 6.4.1. The condition (6.24) which leads to the maximum of $\tilde{J}$ does not depend on $k^a$ and $k^b$. In particular, the sufficient condition is the same for $(k^a, k^b) = (1, 0)$ and $(k^a, k^b) = (0, 1)$, which means that under (6.24), both stations approximately receive the maximum possible optical energy. In other words, the stations are not required to pay any cost in order to increase the payoff of the other station, which indicates a cooperative relationship between them.

Remark 6.4.2. According to (6.11), $\Sigma_t^i$ and $\hat{x}_t^i$ do not explicitly depend on $\{\nu_t^a\}$ and $\{\nu_t^b\}$, thus the control law which leads to (6.24) does not explicitly depend on $\{\nu_t^a\}$ and $\{\nu_t^b\}$. This is important in particular when a reliable model for $\{\nu_t^i\}$ does not exist.

The following lemma determines a control law $u_t^i$ which achieves (6.24).

Lemma 6.4.1. Consider the stochastic differential equation (6.11a) and assume that $L_0^i \hat{x}_0^i = 0$. Let $L_t^i B_t^i$ be nonsingular and $L_t^i$ be differentiable for $t \geq 0$. Then, under the control

\[ u_t^i\,dt = -\left(L_t^i B_t^i\right)^{-1} \left\{ \left(L_t^i A_t^i + \dot{L}_t^i\right) \hat{x}_t^i\,dt + \int_{\mathbb{R}^2} L_t^i M_t^i \left(r - C_t^i \hat{x}_t^i\right) N^i(dt \times dr) \right\} \tag{6.25} \]

we have $L_t^i \hat{x}_t^i = 0$ for every $t \geq 0$.

Proof. We verify the validity of the lemma by substituting (6.25) into (6.11a) and left multiplying both sides by $L_t^i$. The resulting equation will be $L_t^i\,d\hat{x}_t^i = -\dot{L}_t^i \hat{x}_t^i\,dt$, which yields $d(L_t^i \hat{x}_t^i) = 0$. Then we argue that $L_t^i \hat{x}_t^i = L_0^i \hat{x}_0^i = 0$ for $t \geq 0$. □
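For readers who want the substitution in the proof of Lemma 6.4.1 spelled out, the following worked derivation is one way to write it. It uses only (6.11a) and (6.25); the integral term of the estimator cancels against the integral term of the control.

```latex
\begin{align*}
L_t^i\,d\hat{x}_t^i
  &= L_t^i A_t^i \hat{x}_t^i\,dt + L_t^i B_t^i u_t^i\,dt
     + \int_{\mathbb{R}^2} L_t^i M_t^i \left(r - C_t^i \hat{x}_t^i\right) N^i(dt \times dr) \\
  &= L_t^i A_t^i \hat{x}_t^i\,dt - \left(L_t^i A_t^i + \dot{L}_t^i\right) \hat{x}_t^i\,dt \\
  &= -\dot{L}_t^i \hat{x}_t^i\,dt, \\
d\left(L_t^i \hat{x}_t^i\right)
  &= \dot{L}_t^i \hat{x}_t^i\,dt + L_t^i\,d\hat{x}_t^i = 0 .
\end{align*}
```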

Remark 6.4.3. The state-space equation (5.32c) indicates that the elements of $x_t^i$ which are associated with $e_t^i$ constitute an isolated block in (6.1). Also, according to (5.31), this block of $x_t^i$ does not appear in $C_t^i x_t^i$, i.e., its corresponding block in $C_t^i$ is identically zero. Considering these facts, we find from (6.11) that the block of $\hat{x}_t^i$ which is associated with $e_t^i$ is identically zero. This result together with (5.37) indicates¹ that $L_t^i \hat{x}_t^i = 0$ implies $C_t^i \hat{x}_t^i = 0$, which allows us to simplify (6.25) as

\[ u_t^i\,dt = -\left(L_t^i B_t^i\right)^{-1} \left\{ \left(L_t^i A_t^i + \dot{L}_t^i\right) \hat{x}_t^i\,dt + L_t^i M_t^i \int_{\mathbb{R}^2} r\, N^i(dt \times dr) \right\}. \tag{6.26} \]

Remark 6.4.4. As mentioned in Remark 6.4.3, the estimate of $e_t^i$ generated by (6.11) is identically zero. As a consequence, under control (6.26), the effect of $e_t^i$ remains uncompensated. Since $e_t^a$ can be observed only by station $b$, the only possibility to compensate for it is to generate its estimate at station $b$ and send this estimate to station $a$ through the optical channel or an independent RF link.


6.5  Control  Problem:  Long  Range  Applications 

In this section, we develop a method for solving the control problem when $t_d > 0$. In this method, we first establish an analytically tractable lower bound on the objective functional (6.6) and then maximize that lower bound. We shall continue using the approximation (6.9), which leads to $\mu_t^i \approx \tilde{\mu}_t^i$, where $\tilde{\mu}_t^i$ is defined as

\[ \tilde{\mu}_t^a = \nu_{t-t_d}^a\, E\left[\exp\left(-\tfrac{1}{2}\|L_{t-t_d}^b z_{t-t_d}^b\|^2\right)\right], \qquad \tilde{\mu}_t^b = \nu_{t-t_d}^b\, E\left[\exp\left(-\tfrac{1}{2}\|L_{t-t_d}^a z_{t-t_d}^a\|^2\right)\right]. \tag{6.27} \]

1For the transceiver of Figure 5.2, also the condition H + fcK = 0 is required.

In the remainder of this section, we consider an “approximate model” which is similar to the model of Section 6.2.1, except for $\mu_t^i$ being replaced with $\tilde{\mu}_t^i$, $i = a, b$. This approximate model allows us to present our results in exact statements, since its associated estimation problem has an exact solution. Then, these “exact results” are interpreted as “approximate results” for the original model of Section 6.2.1. Noting that $\{\tilde{\mu}_t^i\}$ is statistically independent of $\{w_t^i\}$ and $x_0^i$, the exact solution of the estimation problem associated with the approximate model is given by the posterior densities (6.10) and (6.13).

Our solution to the control problem is based on Theorem 6.5.1, which establishes a lower bound on the objective functional $J$, and Corollary 6.5.1, which determines a sufficient condition for the lower bound to achieve its maximum. In order to prove Theorem 6.5.1 and Corollary 6.5.1, we need the results of Lemmas 6.5.1 and 6.5.2 below.

Lemma 6.5.1. The scalar function $f_t^i(X)$ defined by (6.15) is decreasing in $X$, i.e., $0 \leq X_1 \leq X_2$ (in the sense of positive semidefiniteness ordering) implies that $f_t^i(X_1) \geq f_t^i(X_2) \geq 0$. Furthermore, for any positive definite random matrix $X$ we have

\[ E\left[f_t^i(X)\right] \geq f_t^i\left(E[X]\right). \tag{6.28} \]

Proof. The decreasing property of $f_t^i(\cdot)$ follows from the increasing property of $\det(\cdot)$ over the set of positive semidefinite matrices, which is shown in [53]. To prove (6.28), let us denote the argument of the determinant in (6.15) by $Y$. Since $(\det(\cdot))^{-1}$ is convex over the set of positive definite matrices [53], we use Jensen’s inequality to show

\[ E\left[f_t^i(X)\right] = E\left[\left(\det Y^{1/2}\right)^{-1}\right] \geq \left(\det E\left[Y^{1/2}\right]\right)^{-1}. \]

Then we exploit the matrix inequality $E[Y] \geq \left(E[Y^{1/2}]\right)^2$ and the increasing property of $\det(\cdot)$ to prove (6.28). □
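The two properties in Lemma 6.5.1 can be checked on a small example. The sketch below evaluates $f(X) = [\det(I + L X L^T)]^{-1/2}$, i.e. the $t_d = 0$ form (6.17), for 2x2 matrices and tests monotonicity and the Jensen-type inequality (6.28) for a two-point random matrix. The matrices and the equal-probability distribution are hypothetical values chosen only for illustration.

```python
def f_value(x, l):
    """f(X) = det(I + L X L^T)^(-1/2) for 2x2 matrices X and L."""
    lx = [[sum(l[i][k] * x[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    y = [[(1.0 if i == j else 0.0) + sum(lx[i][k] * l[j][k] for k in range(2))
          for j in range(2)] for i in range(2)]
    det = y[0][0] * y[1][1] - y[0][1] * y[1][0]
    return det ** -0.5


L_MAT = [[1.0, 0.5], [0.0, 2.0]]
X1 = [[0.5, 0.1], [0.1, 0.3]]
DELTA = [[0.4, -0.2], [-0.2, 0.6]]            # positive definite increment
X2 = [[X1[i][j] + DELTA[i][j] for j in range(2)] for i in range(2)]
X_MEAN = [[0.5 * (X1[i][j] + X2[i][j]) for j in range(2)] for i in range(2)]

# X takes the values X1 and X2 with probability 1/2 each.
expected_f = 0.5 * (f_value(X1, L_MAT) + f_value(X2, L_MAT))
f_of_mean = f_value(X_MEAN, L_MAT)
```

Here `expected_f` exceeds `f_of_mean`, and `f_value(X1, L_MAT)` exceeds `f_value(X2, L_MAT)`, in line with the lemma.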

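The following lemma compares the solutions of two generalized matrix Riccati differential equations [54].

Lemma 6.5.2. Consider the symmetric matrix differential equations

\[ \dot{Y}_t^1 = A_t Y_t^1 + Y_t^1 A_t^T + D_t D_t^T - \alpha_t^1 Y_t^1 C_t^T \left(C_t Y_t^1 C_t^T + R\right)^{-1} C_t Y_t^1 \]

\[ \dot{Y}_t^2 = A_t Y_t^2 + Y_t^2 A_t^T + D_t D_t^T - \alpha_t^2 Y_t^2 C_t^T \left(C_t Y_t^2 C_t^T + R\right)^{-1} C_t Y_t^2 \]

and assume that the scalar functions $\alpha_t^1$ and $\alpha_t^2$ satisfy $0 \leq \alpha_t^2 \leq \alpha_t^1$ for $t \in [0, T]$. Then, $0 \leq Y_\tau^1 \leq Y_\tau^2$ implies that $0 \leq Y_t^1 \leq Y_t^2$ for $0 \leq \tau \leq t \leq T$.

Proof. The proof follows from [54, Theorem 4.5] by replacing the independent variable $t$ with $-t$ and noting that

\[ \begin{bmatrix} D_t D_t^T & 0 \\ 0 & R/\alpha_t^2 \end{bmatrix} \geq \begin{bmatrix} D_t D_t^T & 0 \\ 0 & R/\alpha_t^1 \end{bmatrix}. \]

□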
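A scalar instance gives a quick feel for the comparison in Lemma 6.5.2. The sketch below integrates the scalar version of the two Riccati equations with forward Euler and records both paths; the coefficient values, the two weighting functions, and the step size are hypothetical choices, and the scalar reduction itself is an assumption made for illustration.

```python
import math


def riccati_path(y0, alpha, a=-0.2, d=1.0, c=1.0, r=0.5, horizon=5.0, steps=5000):
    """Euler path of dy/dt = 2 a y + d^2 - alpha(t) (c y)^2 / (c y c + r)."""
    dt = horizon / steps
    y = y0
    path = [y0]
    for k in range(steps):
        t = k * dt
        dy = 2.0 * a * y + d * d - alpha(t) * (c * y) ** 2 / (c * y * c + r)
        y += dt * dy
        path.append(y)
    return path


def alpha_large(t):
    """Plays the role of alpha^1 (the larger weight)."""
    return 2.0


def alpha_small(t):
    """Plays the role of alpha^2, with 0 <= alpha^2 <= alpha^1."""
    return 1.0 + 0.5 * math.sin(t)


path_1 = riccati_path(0.5, alpha_large)   # Y^1, starting below Y^2
path_2 = riccati_path(1.0, alpha_small)   # Y^2
ordered = all(0.0 <= y1 <= y2 for y1, y2 in zip(path_1, path_2))
```

With the larger weight on the quadratic term, the first path stays below the second for the whole horizon, mirroring the ordering stated in the lemma.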

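Theorem 6.5.1. Fix sample paths $\nu_t^a$ and $\nu_t^b$, $t \in [-t_d, T]$. Let $f_t^i(\cdot)$ be given by (6.15) and $\beta_t^i$ be defined such that $\beta_t^i = 0$ for $t < 0$ and

\[ \beta_t^i = f_t^i(0_{n\times n}) \left\{1 - E\left[\exp\left(-\tfrac{1}{2}\|Q_t^i L_t^i \hat{z}_t^i\|^2\right)\right]\right\} \]

for $t \geq 0$, where $Q_t^i$ is defined by (6.16). Assume that $\Gamma_t^a$ and $\Gamma_t^b$, $t \in [0, T]$ are the solutions of the matrix delay differential equations

\[ \dot{\Gamma}_t^a = A_t^a \Gamma_t^a + \Gamma_t^a A_t^{aT} + D_t^a D_t^{aT} - \nu_{t-t_d}^a \left(f_{t-t_d}^b(\Gamma_{t-t_d}^b) - \beta_{t-t_d}^b\right) \Gamma_t^a C_t^{aT} \left(C_t^a \Gamma_t^a C_t^{aT} + R\right)^{-1} C_t^a \Gamma_t^a \tag{6.29a} \]

\[ \dot{\Gamma}_t^b = A_t^b \Gamma_t^b + \Gamma_t^b A_t^{bT} + D_t^b D_t^{bT} - \nu_{t-t_d}^b \left(f_{t-t_d}^a(\Gamma_{t-t_d}^a) - \beta_{t-t_d}^a\right) \Gamma_t^b C_t^{bT} \left(C_t^b \Gamma_t^b C_t^{bT} + R\right)^{-1} C_t^b \Gamma_t^b \tag{6.29b} \]

with the initial state $\Gamma_t^i = P_t^i$, $i = a, b$ for $t \in [-t_d, 0]$. Then, $\Gamma_t^a$ and $\Gamma_t^b$ upper bound $E[\Sigma_t^a]$ and $E[\Sigma_t^b]$, respectively, i.e., $\Gamma_t^a \geq E[\Sigma_t^a]$ and $\Gamma_t^b \geq E[\Sigma_t^b]$ for $t \in [0, T]$. Moreover, $J_L$ defined as

\[ J_L = \int_0^T \left(k^a \nu_t^a f_t^b(\Gamma_t^b) + k^b \nu_t^b f_t^a(\Gamma_t^a)\right) dt - \int_0^T \left(k^a \nu_t^a \beta_t^b + k^b \nu_t^b \beta_t^a\right) dt \tag{6.30} \]

is a lower bound for $J$, i.e., $J_L \leq J$.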

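Proof. It is shown in [20, Theorem 4] that the solution of the matrix differential equation

\[ \dot{X}_t^i = A_t^i X_t^i + X_t^i A_t^{iT} + D_t^i D_t^{iT} - \tilde{\mu}_t^i X_t^i C_t^{iT} \left(C_t^i X_t^i C_t^{iT} + R\right)^{-1} C_t^i X_t^i \tag{6.31} \]

with the initial state $X_0^i = P_0^i$ is an upper bound for $E[\Sigma_t^i]$, i.e., $E[\Sigma_t^i] \leq X_t^i$. During $t \in [0, t_d]$, $x_{t-t_d}^b$ is a zero-mean Gaussian random vector with covariance matrix $P_{t-t_d}^b$. It is easy to show that $z_{t-t_d}^b$ is a zero-mean Gaussian random vector with the covariance matrix $H_{t-t_d} P_{t-t_d}^b H_{t-t_d}^T + V_{t-t_d}$. Then, recalling that $\Gamma_{t-t_d}^b = P_{t-t_d}^b$ during $t \in [0, t_d]$, we find from (6.27) that

\[ \tilde{\mu}_t^a = \nu_{t-t_d}^a f_{t-t_d}^b\left(\Gamma_{t-t_d}^b\right), \quad t \in [0, t_d]. \tag{6.32} \]

Substituting $\tilde{\mu}_t^a$ from (6.32) into (6.31) with $i = a$ and noting that $\beta_{t-t_d}^b = 0$, we find that (6.31) and (6.29a) are identical during $t \in [0, t_d]$. Hence, we can write $\Gamma_t^a = X_t^a \geq E[\Sigma_t^a]$ for $t \in [0, t_d]$. With a similar argument, we can show that

\[ \Gamma_t^b = X_t^b \geq E[\Sigma_t^b], \quad t \in [0, t_d]. \]

For $t \in (t_d, T]$, we use the smoothing property of conditional expectation and (6.14) to express (6.27) as

\[ \begin{aligned} \tilde{\mu}_t^a &= \nu_s^a E\left[E\left[\exp\left(-\tfrac{1}{2}\|L_s^b z_s^b\|^2\right) \,\Big|\, \mathcal{B}_s^b\right]\right] \\ &= \nu_s^a E\left[f_s^b(\Sigma_s^b) \exp\left(-\tfrac{1}{2}\|Q_s^b L_s^b \hat{z}_s^b\|^2\right)\right] \\ &= \nu_s^a E\left[f_s^b(\Sigma_s^b)\right] - \nu_s^a E\left[f_s^b(\Sigma_s^b) \left\{1 - \exp\left(-\tfrac{1}{2}\|Q_s^b L_s^b \hat{z}_s^b\|^2\right)\right\}\right] \end{aligned} \]

where, for convenience of notation, we have replaced $t - t_d$ with $s$. From Lemma 6.5.1, we have $f_s^b(0_{n\times n}) \geq f_s^b(\Sigma_s^b)$ and $E[f_s^b(\Sigma_s^b)] \geq f_s^b(E[\Sigma_s^b])$. These two inequalities together with the definition of $\beta_t^b$ result in

\[ \tilde{\mu}_t^a \geq \nu_{t-t_d}^a \left(f_{t-t_d}^b\left(E[\Sigma_{t-t_d}^b]\right) - \beta_{t-t_d}^b\right). \]

We partition the interval $[0, T]$ as

\[ \mathcal{I}_1 = [0, t_d], \quad \mathcal{I}_2 = (t_d, 2t_d], \quad \ldots, \quad \mathcal{I}_n = (n t_d - t_d, T]. \tag{6.33} \]

For $t \in \mathcal{I}_1$ we already proved that $\Gamma_t^i \geq X_t^i \geq E[\Sigma_t^i]$. Assume that $\Gamma_t^i \geq X_t^i \geq E[\Sigma_t^i]$ holds for $t \in \mathcal{I}_k$. Then, for $t \in \mathcal{I}_{k+1}$ we have $f_s^b(E[\Sigma_s^b]) \geq f_s^b(\Gamma_s^b)$, which results in

\[ \tilde{\mu}_t^a \geq \nu_{t-t_d}^a \left(f_{t-t_d}^b\left(\Gamma_{t-t_d}^b\right) - \beta_{t-t_d}^b\right). \tag{6.34} \]

In Lemma 6.5.2, let $X_t^a$, $\Gamma_t^a$, and $k t_d$ play the role of $Y_t^1$, $Y_t^2$, and $\tau$, respectively. Then inequality (6.34) and $\Gamma_{k t_d}^a \geq X_{k t_d}^a$ imply that $\Gamma_t^a \geq X_t^a \geq E[\Sigma_t^a]$ for $t \in \mathcal{I}_{k+1}$. A similar argument shows that $\Gamma_t^b \geq X_t^b \geq E[\Sigma_t^b]$ for $t \in \mathcal{I}_{k+1}$. Repeating this process, we show that $\Gamma_t^i \geq X_t^i \geq E[\Sigma_t^i]$ holds for $t \in \mathcal{I}_k$, $k = 1, 2, \ldots, n$, which means that $\Gamma_t^i \geq E[\Sigma_t^i]$ holds for every $t \in [0, T]$. The second statement of the theorem follows from (6.34). □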

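Corollary 6.5.1. Under the assumptions of Theorem 6.5.1, let $\Gamma_t^{*a}$ and $\Gamma_t^{*b}$, $t \in [0, T]$ be the solutions of (6.29) with $\beta_t^a = \beta_t^b = 0$, $t \in [0, T - t_d]$. Then, $J_L^*$ defined as

\[ J_L^* = \int_0^T \left(k^a \nu_t^a f_t^b(\Gamma_t^{*b}) + k^b \nu_t^b f_t^a(\Gamma_t^{*a})\right) dt \]

is an upper bound for $J_L$, i.e., $J_L^* \geq J_L$, and equality holds if for almost every $t \in [0, T]$ we have

\[ L_t^a \hat{z}_t^a = L_t^b \hat{z}_t^b = 0. \tag{6.35} \]

Proof. Referring to the partition (6.33), since $\beta_t^a = \beta_t^b = 0$ for $t < 0$, we have $\Gamma_t^{*i} = \Gamma_t^i$ for $t \in \mathcal{I}_1$. Assume that $\Gamma_t^{*i} \leq \Gamma_t^i$ for $t \in \mathcal{I}_k$. Then, from the decreasing property of $f_t^i(\cdot)$ and the fact that $\beta_t^i \geq 0$, we conclude that

\[ \nu_{t-t_d}^a f_{t-t_d}^b\left(\Gamma_{t-t_d}^{*b}\right) \geq \nu_{t-t_d}^a \left(f_{t-t_d}^b\left(\Gamma_{t-t_d}^b\right) - \beta_{t-t_d}^b\right) \tag{6.36} \]

holds for $t \in \mathcal{I}_{k+1}$. Similar to the proof of Theorem 6.5.1, we employ Lemma 6.5.2 and inequality (6.36) to show that $\Gamma_t^{*i} \leq \Gamma_t^i$ for $t \in \mathcal{I}_{k+1}$. As before, repeating this process, we show for $k = 1, 2, \ldots, n$ that $\Gamma_t^{*i} \leq \Gamma_t^i$, $t \in \mathcal{I}_k$. This means that $\Gamma_t^{*i} \leq \Gamma_t^i$ holds for every $t \in [0, T]$. Applying this inequality and $\beta_t^i \geq 0$ to (6.30), we find $J_L^* \geq J_L$. From the definition of $\beta_t^i$ we know that (6.35) results in $\beta_t^a = \beta_t^b = 0$ almost everywhere in $[0, T]$, which leads to $J_L^* = J_L$. □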

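The following lemma proposes a control law $u_t^{p,i}$ which leads to $L_t^i H_t \hat{x}_t^i = 0$, $t \geq 0$, or equivalently the condition (6.35).

Lemma 6.5.3. Consider the stochastic differential equation (6.11a) with $B_t^i u_t^i = B_t^{o,i} u_t^{o,i} + B_t^{p,i} u_t^{p,i}$ and assume that $L_0^i H_0 \hat{x}_0^i = 0$. Let $L_t^i H_t B_t^{p,i}$ be nonsingular and $L_t^i H_t$ be differentiable for $t \geq 0$. Then, under the control

\[ u_t^{p,i}\,dt = -\left(L_t^i H_t B_t^{p,i}\right)^{-1} \left\{ \left(L_t^i H_t A_t^i + \dot{L}_t^i H_t + L_t^i \dot{H}_t\right) \hat{x}_t^i\,dt + L_t^i H_t B_t^{o,i} u_t^{o,i}\,dt + \int_{\mathbb{R}^2} L_t^i H_t M_t^i \left(r - C_t^i \hat{x}_t^i\right) N^i(dt \times dr) \right\} \tag{6.37} \]

we have $L_t^i H_t \hat{x}_t^i = 0$ for every $t \geq 0$.

Proof. The proof is similar to the proof of Lemma 6.4.1. □

As mentioned in Section 6.2.2, the goal of the tracking control is to keep $C_t^i \hat{x}_t^i = 0$, $i = a, b$ for every $t \geq 0$. Suppose that $C_0^i \hat{x}_0^i = 0$, $C_t^i B_t^{o,i}$ is nonsingular, and $C_t^i$ is differentiable. Then, according to Lemma 6.4.1, we can achieve $C_t^i \hat{x}_t^i = 0$, $t \geq 0$, using the control law

\[ u_t^{o,i}\,dt = -\left(C_t^i B_t^{o,i}\right)^{-1} \left\{ \left(C_t^i A_t^i + \dot{C}_t^i\right) \hat{x}_t^i\,dt + C_t^i M_t^i \int_{\mathbb{R}^2} r\, N^i(dt \times dr) \right\}. \tag{6.38} \]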


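Upon combining (6.37) and (6.38), we get

\[ u_t^i\,dt = F_t^i \hat{x}_t^i\,dt + G_t^i M_t^i \int_{\mathbb{R}^2} r\, N^i(dt \times dr) \]

where the matrices $F_t^i$ and $G_t^i$ are given by

\[ F_t^i = -\begin{bmatrix} I_{2\times 2} & 0_{2\times 2} \\ \left(L_t^i H_t B_t^{p,i}\right)^{-1} L_t^i H_t B_t^{o,i} & I_{2\times 2} \end{bmatrix}^{-1} \begin{bmatrix} \left(C_t^i B_t^{o,i}\right)^{-1} \left(C_t^i A_t^i + \dot{C}_t^i\right) \\ \left(L_t^i H_t B_t^{p,i}\right)^{-1} \left(L_t^i H_t A_t^i + \dot{L}_t^i H_t + L_t^i \dot{H}_t\right) \end{bmatrix} \]

and

\[ G_t^i = -\begin{bmatrix} I_{2\times 2} & 0_{2\times 2} \\ \left(L_t^i H_t B_t^{p,i}\right)^{-1} L_t^i H_t B_t^{o,i} & I_{2\times 2} \end{bmatrix}^{-1} \begin{bmatrix} \left(C_t^i B_t^{o,i}\right)^{-1} C_t^i \\ \left(L_t^i H_t B_t^{p,i}\right)^{-1} L_t^i H_t \end{bmatrix}. \]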

6.6  Proof  of  Theorem  6.4.1

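Using the differential rule for point processes¹ [24], we can write

\[ \begin{aligned} dg_t(\Sigma_t^a, \Sigma_t^b) = {} & \left(\partial g_t(\Sigma_t^a, \Sigma_t^b)/\partial t\right) dt \\ & + \mathrm{tr}\left\{\left(\partial g_t(\Sigma_t^a, \Sigma_t^b)/\partial \Sigma_t^a\right)\left(A_t^a \Sigma_t^a + \Sigma_t^a A_t^{aT} + D_t^a D_t^{aT}\right)\right\} dt \\ & + \mathrm{tr}\left\{\left(\partial g_t(\Sigma_t^a, \Sigma_t^b)/\partial \Sigma_t^b\right)\left(A_t^b \Sigma_t^b + \Sigma_t^b A_t^{bT} + D_t^b D_t^{bT}\right)\right\} dt \\ & + \mathcal{L}_t^a\{g_t(\Sigma_t^a, \Sigma_t^b)\}\, dN_t^a + \mathcal{L}_t^b\{g_t(\Sigma_t^a, \Sigma_t^b)\}\, dN_t^b. \end{aligned} \tag{6.39} \]

Noting that $\{\mu_t^i\}$ defined by (6.8) is the rate of $\{N_t^i\}$, we use the smoothing property of conditional expectation to show that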

E  [4  {g,  (Ef,  Ef) }  dNi]  =  E  [E  [C\  {g,  (Ef,  Ef) }  dJV’|M“]] 

=  E[E[4{a(E?,Ef)}rt*K]] 

=  E  [C\  {gt  (Ef,  Efj }  4]  di,  i  =  o,  6. 


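¹This is the counterpart of the Itô differential rule for the Wiener process.

Taking the expectation of both sides of (6.39), using the last result, and noting that $E[dg_t(\Sigma_t^a, \Sigma_t^b)] = dE[g_t(\Sigma_t^a, \Sigma_t^b)]$, we get

\[ \begin{aligned} dE\left[g_t(\Sigma_t^a, \Sigma_t^b)\right] = E\Big[ & \partial g_t(\Sigma_t^a, \Sigma_t^b)/\partial t \\ & + \mathrm{tr}\left\{\left(\partial g_t(\Sigma_t^a, \Sigma_t^b)/\partial \Sigma_t^a\right)\left(A_t^a \Sigma_t^a + \Sigma_t^a A_t^{aT} + D_t^a D_t^{aT}\right)\right\} \\ & + \mathrm{tr}\left\{\left(\partial g_t(\Sigma_t^a, \Sigma_t^b)/\partial \Sigma_t^b\right)\left(A_t^b \Sigma_t^b + \Sigma_t^b A_t^{bT} + D_t^b D_t^{bT}\right)\right\} \\ & + \mathcal{L}_t^a\{g_t(\Sigma_t^a, \Sigma_t^b)\}\, \mu_t^a + \mathcal{L}_t^b\{g_t(\Sigma_t^a, \Sigma_t^b)\}\, \mu_t^b \Big]\, dt. \end{aligned} \]

We add the expression $E[k^a \mu_t^a + k^b \mu_t^b]\,dt$ to both sides of this equation and rearrange the terms in order to obtain

\[ \begin{aligned} & E\left[k^a \mu_t^a + k^b \mu_t^b\right] dt - \Big\{ -dE\left[g_t(\Sigma_t^a, \Sigma_t^b)\right] \\ & \qquad + E\left[\left(\mu_t^a - \nu_t^a f_t^b(\Sigma_t^b)\right)\left(k^a + \mathcal{L}_t^a\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\right] dt \\ & \qquad + E\left[\left(\mu_t^b - \nu_t^b f_t^a(\Sigma_t^a)\right)\left(k^b + \mathcal{L}_t^b\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\right] dt \Big\} \\ & = E\Big[ \frac{\partial g_t(\Sigma_t^a, \Sigma_t^b)}{\partial t} + \nu_t^a f_t^b(\Sigma_t^b)\left(k^a + \mathcal{L}_t^a\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right) + \nu_t^b f_t^a(\Sigma_t^a)\left(k^b + \mathcal{L}_t^b\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right) \\ & \qquad + \mathrm{tr}\left\{\frac{\partial g_t(\Sigma_t^a, \Sigma_t^b)}{\partial \Sigma_t^a}\left(A_t^a \Sigma_t^a + \Sigma_t^a A_t^{aT} + D_t^a D_t^{aT}\right)\right\} \\ & \qquad + \mathrm{tr}\left\{\frac{\partial g_t(\Sigma_t^a, \Sigma_t^b)}{\partial \Sigma_t^b}\left(A_t^b \Sigma_t^b + \Sigma_t^b A_t^{bT} + D_t^b D_t^{bT}\right)\right\} \Big]\, dt. \end{aligned} \]

Since $g_t(\cdot, \cdot)$ is the solution of (6.21), the right side of the equation above is identically zero, which leads to

\[ \begin{aligned} E\left[k^a \mu_t^a + k^b \mu_t^b\right] dt = {} & -dE\left[g_t(\Sigma_t^a, \Sigma_t^b)\right] \\ & + E\left[\left(\mu_t^a - \nu_t^a f_t^b(\Sigma_t^b)\right)\left(k^a + \mathcal{L}_t^a\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\right] dt \\ & + E\left[\left(\mu_t^b - \nu_t^b f_t^a(\Sigma_t^a)\right)\left(k^b + \mathcal{L}_t^b\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\right] dt. \end{aligned} \]

Integrating this equation from 0 to $T$ and noting that $E[g_T(\Sigma_T^a, \Sigma_T^b)] = 0$ and $E[g_0(\Sigma_0^a, \Sigma_0^b)] = g_0(P_0^a, P_0^b)$, we obtain

\[ \begin{aligned} J = \int_0^T E\left[k^a \mu_t^a + k^b \mu_t^b\right] dt = {} & g_0(P_0^a, P_0^b) \\ & + \int_0^T E\left[\left(\mu_t^a - \nu_t^a f_t^b(\Sigma_t^b)\right)\left(k^a + \mathcal{L}_t^a\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\right] dt \\ & + \int_0^T E\left[\left(\mu_t^b - \nu_t^b f_t^a(\Sigma_t^a)\right)\left(k^b + \mathcal{L}_t^b\{g_t(\Sigma_t^a, \Sigma_t^b)\}\right)\right] dt. \end{aligned} \tag{6.40} \]

From (6.8) and (6.23), we have

\[ \begin{aligned} \mu_t^a - \nu_t^a f_t^b(\Sigma_t^b) &= \nu_t^a \delta_t^b - \nu_t^a f_t^b(\Sigma_t^b)\left\{1 - \exp\left(-\tfrac{1}{2}\|Q_t^b L_t^b \hat{x}_t^b\|^2\right)\right\} \\ \mu_t^b - \nu_t^b f_t^a(\Sigma_t^a) &= \nu_t^b \delta_t^a - \nu_t^b f_t^a(\Sigma_t^a)\left\{1 - \exp\left(-\tfrac{1}{2}\|Q_t^a L_t^a \hat{x}_t^a\|^2\right)\right\}. \end{aligned} \]

Substituting these expressions into (6.40), we obtain (6.22).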

In  order  to  prove  the  second  statement  of  the  theorem,  we  need  the  following 
preliminaries. 

P-1) In the context of this proof, we say $f(\cdot) : \mathbb{R}^{n\times n} \to \mathbb{R}$ is decreasing if for any positive definite¹ $\Sigma$ and any positive semidefinite $\Delta$, we have $f(\Sigma + \Delta) \leq f(\Sigma)$. Also, we say $f(\cdot)$ is m-nonnegative if for any positive definite $\Sigma$, we have $f(\Sigma) \geq 0$.

P-2) If $f_1(\cdot)$ and $f_2(\cdot)$ are decreasing and m-nonnegative, $f_1(\cdot) + f_2(\cdot)$ and $f_1(\cdot) f_2(\cdot)$ are decreasing and m-nonnegative as well.

P-3) If $f(\cdot)$ is decreasing, for any positive definite $\Sigma$, we have $f(\Sigma) \leq f(S_t^i(\Sigma))$, where $S_t^i(\cdot)$ is defined by (6.20).

Proof. Applying the matrix inversion lemma to (6.20), it is easy to verify that for any positive definite $\Sigma$, $S_t^i(\Sigma)$ is positive definite. Also, we know from (6.20) that $\Delta = \Sigma - S_t^i(\Sigma)$ is a positive semidefinite matrix. Since $f(\cdot)$ is decreasing, we can write

\[ f(\Sigma) = f\left(S_t^i(\Sigma) + \Delta\right) \leq f\left(S_t^i(\Sigma)\right). \]

□

P-4)  If  /  (•)  is  decreasing  and  m-nonnegative,  for  any  fixed  f,  f  (SI  (•))  is  decreasing 
and  m-nonnegative. 

Proof.  For  any  positive  definite  E  and  any  positive  semidefinite  A,  we  can 
show 

E"1  -  (E  +  A)-1  =  E_1A1/2  (/  +  A1/2E-1A1/2)_1  A^E"1  A  A  (6.41) 

xBy  definition,  any  positive  definite  matrix  is  symmetric. 
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where  I  is  the  identity  matrix  with  proper  dimension.  This  indicates  that  A 
is  positive  semidehnite.  Using  the  matrix  inversion  lemma  and  replacing  E_1 
with  (E  +  A)-1  +  A,  we  can  write 

Slt  (E  +  A)  -  Slt  (E)  =  ((S  +  A)"1  +  CfR-'Ciy1  -  (S'1  +  CfiT1^)-1 

=  ((s  +  A)-1  +  c^R-'ciy1 

-  (((E  +  A)-1  +  Cf iT'Ct)  +  a)T 

Applying (6.41) to the last equality, we find that $S_t^i(\Sigma + \Delta) - S_t^i(\Sigma)$ is positive
semidefinite. Then, since $S_t^i(\Sigma)$ is positive definite and $f(\cdot)$ is decreasing, we
have

$$f(S_t^i(\Sigma + \Delta)) = f\left(S_t^i(\Sigma) + \{S_t^i(\Sigma + \Delta) - S_t^i(\Sigma)\}\right) \le f(S_t^i(\Sigma)),$$

which means that $f(S_t^i(\cdot))$ is decreasing. Moreover, since $f(\cdot)$ is m-nonnegative
and $S_t^i(\Sigma)$ is positive definite, $f(S_t^i(\cdot))$ is m-nonnegative. □
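
As a numerical sanity check of the two matrix inequalities used in (P-3) and (P-4), the following minimal sketch tests them on random 2x2 examples. It assumes that (6.20) has the form $S(\Sigma) = (\Sigma^{-1} + C^T R^{-1} C)^{-1}$ suggested by the proof of (P-4). The names `S`, `info` (standing for $C^T R^{-1} C$), `gram`, and `check` are illustrative and are not part of the dissertation. A passing check is not a proof; it only guards against sign errors.

```python
import random


def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def add2(x, y):
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))


def sub2(x, y):
    return tuple(tuple(x[i][j] - y[i][j] for j in range(2)) for i in range(2))


def gram(v, w, shift=0.0):
    """Return v v^T + w w^T + shift * I: symmetric PSD, and PD when shift > 0."""
    return tuple(
        tuple(v[i] * v[j] + w[i] * w[j] + (shift if i == j else 0.0)
              for j in range(2))
        for i in range(2)
    )


def is_psd(m, tol=1e-9):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    (a, b), (c, d) = m
    return a + d >= -tol and a * d - b * c >= -tol


def S(sigma, info):
    """Assumed form of (6.20): (sigma^{-1} + info)^{-1}, info = C^T R^{-1} C."""
    return inv2(add2(inv2(sigma), info))


def rand_vec(rng):
    return (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))


def check(trials=1000, seed=0):
    """Return True if both inequalities hold on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        sigma = gram(rand_vec(rng), rand_vec(rng), shift=0.1)  # positive definite
        delta = gram(rand_vec(rng), rand_vec(rng))             # positive semidefinite
        info = gram(rand_vec(rng), (0.0, 0.0))                 # rank-one PSD
        # Used in (P-3): sigma - S(sigma) is positive semidefinite.
        if not is_psd(sub2(sigma, S(sigma, info))):
            return False
        # Used in (P-4): S(sigma + delta) - S(sigma) is positive semidefinite.
        if not is_psd(sub2(S(add2(sigma, delta), info), S(sigma, info))):
            return False
    return True
```

Calling `check()` runs both tests on 1000 random triples; the 2x2 restriction is only for brevity, since the argument in the text holds for any dimension.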

P-5) For any fixed $t$, $f_t^i(\cdot)$ defined by (6.17) is decreasing and m-nonnegative.

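Proof. For any positive definite $\Sigma$ and any positive semidefinite $\Delta$, we can
write

$$\frac{f_t^i(\Sigma)}{f_t^i(\Sigma + \Delta)} = \sqrt{\frac{\det\left(I_{2\times 2} + L_t^i \Sigma (L_t^i)^T + L_t^i \Delta (L_t^i)^T\right)}{\det\left(I_{2\times 2} + L_t^i \Sigma (L_t^i)^T\right)}} = \sqrt{\det\left(I_{2\times 2} + \Delta_t^i\right)},$$

where $\Delta_t^i$ is defined as

$$\Delta_t^i = \left(\left(I_{2\times 2} + L_t^i \Sigma (L_t^i)^T\right)^{-1/2} L_t^i\right) \Delta \left(\left(I_{2\times 2} + L_t^i \Sigma (L_t^i)^T\right)^{-1/2} L_t^i\right)^T.$$

This implies that $\det(I_{2\times 2} + \Delta_t^i) \ge 1$, which leads to $f_t^i(\Sigma + \Delta) \le f_t^i(\Sigma)$. □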
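
The determinant bound in the proof of (P-5) can be spelled out as follows. The shorthand $M$ and the eigenvalues $\lambda_1, \lambda_2$ are notation introduced here for illustration only; the step uses nothing beyond the positive semidefiniteness of $\Delta$.

```latex
M \triangleq \left(I_{2\times 2} + L_t^i \Sigma (L_t^i)^T\right)^{-1/2} L_t^i,
\qquad
M \Delta M^T \succeq 0
\;\Longrightarrow\;
\lambda_1, \lambda_2 \ge 0,
\\[4pt]
\det\left(I_{2\times 2} + M \Delta M^T\right)
  = (1 + \lambda_1)(1 + \lambda_2) \ge 1,
```

where $\lambda_1$ and $\lambda_2$ denote the eigenvalues of the symmetric $2 \times 2$ matrix $M \Delta M^T$. The ratio of the two values of $f_t^i$ is therefore at least one, which is the decreasing property.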

P-6) Let $h_t(\Sigma^a, \Sigma^b)$ be a scalar function of $n \times n$ matrices $\Sigma^a$ and $\Sigma^b$. Assume that
this function is decreasing and m-nonnegative in both $\Sigma^a$ and $\Sigma^b$. For $\epsilon \ge 0$,
define the linear operator $\mathcal{K}_t^\epsilon$ as

K-‘M E“.E‘)  =  (l-</,*(Et)  -6^/ta(S“))A.(Ar(S0),A:,l’,‘(E'’)) 

where 

xr  (E)  =  E  +  e  (A\ E  +  EAf  +  . 

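Then, for any positive definite $\Sigma^a$ and $\Sigma^b$ and any positive semidefinite $\Delta^a$
and $\Delta^b$, there exists $\zeta = \zeta(\Sigma^a, \Sigma^b, \Delta^a, \Delta^b) > 0$ such that for every $0 \le \epsilon < \zeta$,
we have

$$\mathcal{K}_t^\epsilon h_t(\Sigma^a + \Delta^a, \Sigma^b) \le \mathcal{K}_t^\epsilon h_t(\Sigma^a, \Sigma^b)$$

$$\mathcal{K}_t^\epsilon h_t(\Sigma^a, \Sigma^b + \Delta^b) \le \mathcal{K}_t^\epsilon h_t(\Sigma^a, \Sigma^b)$$

$$\mathcal{K}_t^\epsilon h_t(\Sigma^a, \Sigma^b) \ge 0.$$

Therefore, as $\epsilon \to 0^+$, these conditions are satisfied for any choice of $\Sigma^a$, $\Sigma^b$,
$\Delta^a$, and $\Delta^b$. This means that $\mathcal{K}_t^\epsilon h_t(\Sigma^a, \Sigma^b)$ is decreasing and m-nonnegative
in both $\Sigma^a$ and $\Sigma^b$, as $\epsilon \to 0^+$.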

We claim that $g_t(\Sigma^a, \Sigma^b)$, the solution of equation (6.21) with the boundary condition
$g_T(\Sigma^a, \Sigma^b) = 0$, is decreasing in both $\Sigma^a$ and $\Sigma^b$ for every $t \in [0, T]$. Once the
claim is proven, we apply (P-3) to (6.19) in order to show that $\mathcal{L}_t^i\{g_t(\Sigma^a, \Sigma^b)\} \ge 0$,
$i = a, b$, for all positive definite matrices $\Sigma^a$ and $\Sigma^b$ and every $t \in [0, T]$.

In order to prove this claim, for any $0 \le t < T$, we partition the interval
$[t, T]$ into $K$ subintervals $[t_{k+1}, t_k)$, $k = 0, 1, \ldots, K - 1$, where $t_K = t$, $t_0 = T$,
and $t_k - t_{k+1} = \epsilon_k > 0$. Using this partition, we discretize the partial differential
equation (6.21) to obtain the recursive equation

E‘)  =  +  ^</i(s“)) 

+  «*(</.*  (Sft  (E“) .  E*)  +  (E“)9tl  (£“,  Si  (E*))) 

+  /CJ9t>(E“,Et)+0(4).  (6.42) 

Starting from $g_{t_0}(\cdot, \cdot) = 0$ and using this recursive equation for $k = 0, 1, 2, \ldots, K - 1$,
we can determine $g_{t_K}(\cdot, \cdot)$. Then, by letting $K \to \infty$ such that $\max_k \epsilon_k \to 0$, we have
$g_{t_K}(\cdot, \cdot) \to g_t(\cdot, \cdot)$.

We prove by induction that as $K \to \infty$ and $\max_k \epsilon_k \to 0$, for $k = 0, 1, \ldots, K$,
$g_{t_k}(\cdot, \cdot)$ is decreasing and m-nonnegative in both $\Sigma^a$ and $\Sigma^b$. Clearly, $g_{t_0}(\Sigma^a, \Sigma^b) = 0$
is decreasing and m-nonnegative in both $\Sigma^a$ and $\Sigma^b$. Also, we show that if $g_{t_k}(\Sigma^a, \Sigma^b)$
is decreasing and m-nonnegative, $g_{t_{k+1}}(\Sigma^a, \Sigma^b)$ is decreasing and m-nonnegative as
well. For this purpose, we use (P-2, P-5) and (P-2, P-4, P-5), respectively, to
show that the first and the second terms on the right side of (6.42) are decreasing
and m-nonnegative. Also, as $\epsilon_k \to 0^+$, (P-6) implies that the third term on the
right side of (6.42) is decreasing and m-nonnegative. Since all three terms on the
right side of (6.42) are decreasing and m-nonnegative, we conclude from (P-2) that
$g_{t_{k+1}}(\Sigma^a, \Sigma^b)$ is decreasing and m-nonnegative.
Chapter 7


Conclusion  and  Directions  for  Future  Work 

7.1  Summary  of  Main  Contributions 

This dissertation is devoted to finding solutions for two major concerns in free-space
optics: digital communication over a free-space optical channel and optical alignment
between the transmitter and the receiver. Adopting a stochastic approach,
we formulated these concerns in terms of detection, estimation, and optimal control
problems with point process observations. This observation model is imposed by the
nature of optical sensors, which are the essential components of a free-space optical
link.

We discussed an M-ary detection problem associated with a marked and filtered
Poisson process in additive white Gaussian noise. The motivation for this
study comes from digital communication over optical channels (fiber or
free-space). The stochastic model adopted for the problem is adequate for characterizing
the optical sensors (including the avalanche gain), the thermal noise generated by
the amplifying circuits, and the atmosphere-induced optical fade. We obtained a
solution for the problem in terms of an infinite sum of multiple integrals, which is
hard to express in an explicit form. We established two sets of upper and lower
bounds on this infinite sum, where the bounds in the first set are given in explicit
form, while in the second set, two integral equations determine the bounds. In
both cases, we observed that under certain conditions, the lower bound is close
to the upper bound, which motivates approximating the solution by one of
these bounds. In another effort to simplify the infinite sum, we expressed it in
terms of an expectation taken with respect to a stochastic process. The resulting
expression was our point of departure for developing several approximations with
different levels of complexity. These approximations can be implemented by means of
finite-dimensional, nonlinear, causal filters.

A stochastic dynamical model introduced in [20] plays a central role in our
study of optical alignment. This model consists of a linear stochastic state-space
equation driven by a control vector and a vector-valued Wiener process, and the
observation of a space-time point process with a rate which depends on the state
vector. Associated with this model, an optimal control problem is defined in [20]
in terms of minimizing a quadratic cost functional. The model and its associated
control problem have been used in [19] in order to analyze an optical beam tracking
system which employs an infinite resolution position-sensitive photodetector. In
that study, it is assumed that the observation is provided over $\mathbb{R}^2$ and that the rate
of the space-time point process has a Gaussian profile. Under these assumptions, the
control problem and its associated state estimation problem have finite-dimensional
exact solutions [20]. We used these solutions to develop a suboptimal control law
for an optical beam tracking system with a finite resolution photodetector.
Furthermore, in an effort to extend these results to the case of a non-Gaussian rate,
we demonstrated that the estimation problem can be formulated in terms of
estimating the state of a discrete-time linear model with an observation vector which
is corrupted by additive white non-Gaussian noise.

We proposed an active pointing scheme in which the receiving station estimates
the center of its incident optical beam by means of a position-sensitive photodetector.
The transmitter receives this estimate via an independent communication link
and incorporates it to accurately aim at the receiving station. We showed that
the stochastic model mentioned above can adequately characterize this alignment
scheme, but with the observation provided over a subset of $\mathbb{R}^2$ instead of the entire $\mathbb{R}^2$.
Regarding this modified model, we determined a suboptimal state estimator and a
suboptimal control law. In addition, we demonstrated that our suboptimal results
tend toward the optimal results when the observation is provided over the entire $\mathbb{R}^2$.

We  studied  the  concept  of  cooperative  optical  beam  tracking  and  developed  a 
detailed  nonlinear  model  for  it.  Next  we  linearized  the  model  around  a  nominal  state 
trajectory and presented a stochastic description for its disturbance vectors.
Associated with this stochastic model, we considered an optimal control problem with the
goal  of  maximizing  an  objective  functional  defined  as  the  expected  optical  energy 
received  by  the  stations  of  the  link.  For  short  range  applications  with  negligible 
light  propagation  delay,  we  proposed  a  suboptimal  control  law  which  approximately 
maximizes  the  objective  functional.  We  demonstrated  that  the  proposed  control 
law  does  not  depend  on  the  nature  of  the  optical  fade  or  the  information-bearing 
signals  which  modulate  the  optical  beams.  In  addition,  we  showed  that  the  control 
law simultaneously maximizes the received optical energy for both stations. We also
addressed the considerations arising from light propagation delay in a cooperative
optical beam tracking system. For the case in which the propagation delay is significant,
a  suboptimal  control  law  was  developed  based  on  maximizing  a  lower  bound  on  the 
objective  functional. 

7.2  Directions  for  Future  Work 

In this section, we sketch some directions for future research. We first highlight a
few  problems  regarding  the  topics  discussed  in  this  dissertation  which  can  complete 
or  extend  our  present  results.  Then,  we  briefly  explain  an  application  of  free-space 
optical  communication  in  space  missions  and  its  associated  problems  which  can  be 
viewed  in  the  framework  of  this  study. 

7.2.1  Extending  the  Present  Results 

In  Chapter  2,  we  developed  several  detection  rules  as  the  solution  to  our  detection 
problem.  In  order  to  evaluate  the  performance  of  each  detector,  it  is  required  to 
obtain  its  associated  probability  of  error.  In  addition,  this  performance  measure  can 
be  used  to  compare  the  effectiveness  of  the  proposed  detection  rules.  In  particular, 
it  is  useful  to  compare  the  performance  of  each  detection  rule  with  a  simple  linear 
detector. This comparison evaluates the improvement of the performance as a
result of accepting the complexity of a nonlinear detector. On the other hand, it is
important to determine how the probability of error varies with the parameters of
the model. Considering the complexity of the detectors and the observation process,
it is difficult to obtain an analytical expression for the probability of error. It seems
that numerical methods such as Monte Carlo simulation are the most practical
way to compute this quantity.
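
A minimal sketch of such a Monte Carlo estimate is given below for a deliberately simplified binary model, not for the detectors of Chapter 2. It assumes that the photon count in a symbol interval is Poisson with mean `mean0` or `mean1`, that zero-mean Gaussian noise with standard deviation `noise_std` is added, and that the detector compares the observation with a fixed `threshold`. All names and parameter values are hypothetical.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate by Knuth's multiplication method (small means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_error_rate(mean0, mean1, noise_std, threshold,
                        prior1=0.5, trials=20000, seed=1):
    """Estimate the error probability of a threshold detector by simulation."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        bit = 1 if rng.random() < prior1 else 0
        # Assumed observation: Poisson photon count plus Gaussian thermal noise.
        photons = sample_poisson(mean1 if bit else mean0, rng)
        observation = photons + rng.gauss(0.0, noise_std)
        decision = 1 if observation > threshold else 0
        if decision != bit:
            errors += 1
    return errors / trials
```

For instance, `simulate_error_rate(1.0, 30.0, 0.5, 10.0)` estimates the error rate for well-separated count means. Repeating the call over a range of model parameters gives the dependence of the error probability on those parameters discussed above.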

Upon  computing  the  probability  of  error,  we  can  improve  the  performance 
of  the  detection  rules  by  optimizing  their  associated  threshold.  To  explain  this 
idea,  consider  the  detection  rule  (2.7)  for  the  binary  case.  In  this  detection  rule, 
the threshold $p_1/p_2$ is determined in terms of the prior probabilities of the binary
message; however, when we replace the exact likelihood ratio function $L_b(T)$ with
its approximation, the optimal threshold is not necessarily $p_1/p_2$. Thus, to achieve
the best performance, we can determine the probability of error as a function of an
unknown threshold and then minimize this function with respect to the threshold.
This modification can be easily extended to the general M-ary hypothesis testing
problem whose optimal detector is given by (2.6). In this case, the probability of
error must be minimized with respect to $p_1, p_2, \ldots, p_M$, subject to the constraints
$p_i \ge 0$, $i = 1, 2, \ldots, M$ and $p_1 + p_2 + \cdots + p_M = 1$.
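
The idea of treating the threshold as a free parameter can be sketched on a toy problem. The example below assumes a pure photon-counting detector (Poisson counts with means `mean0` and `mean1`, no Gaussian noise) that decides 1 when the count reaches an integer threshold. It computes the error probability as a function of that threshold and minimizes it by exhaustive search. The model and all names are assumptions made for illustration, not the detection rules (2.6) or (2.7).

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability mass function, evaluated in the log domain."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def error_probability(k_threshold, mean0, mean1, prior1):
    """Error probability of the rule: decide 1 iff count >= k_threshold."""
    p_miss = sum(poisson_pmf(k, mean1) for k in range(k_threshold))
    p_false_alarm = 1.0 - sum(poisson_pmf(k, mean0) for k in range(k_threshold))
    return prior1 * p_miss + (1.0 - prior1) * p_false_alarm


def best_threshold(mean0, mean1, prior1, k_max=200):
    """Search integer thresholds 0..k_max for the smallest error probability."""
    return min(range(k_max + 1),
               key=lambda k: error_probability(k, mean0, mean1, prior1))
```

With an approximate decision statistic, the same search would be run on an error probability obtained by simulation instead of the closed-form sum used here.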

In  Section  3.5,  we  proposed  a  control  law  for  an  optical  beam  tracking  system 
with  a  finite  resolution  photodetector.  This  controller  was  developed  by  applying  an 
approximation  scheme  to  the  results  of  [20]  for  an  infinite  resolution  photodetector. 
Another  approach  is  to  directly  solve  the  estimation  and  control  problems  for  the 
original finite resolution photodetector. Since in this case, the estimation problem is
infinite-dimensional, we have to approximate its solution using a finite-dimensional
filter. The cumulant matching method developed in Section 4.3 is a possible scheme
for this approximation. Following this method, we can approximate the posterior
density with a Gaussian density function whose mean and covariance matrix are
obtained from a set of stochastic differential equations driven by the photodetector
output. Further, a suboptimal control law can be determined by following the
procedure in Section 4.4. Only after finding this controller and comparing its performance
with the already developed controller can we decide which one better approximates
the solution of the optimal control problem.

In Chapter 5, we mentioned the possible application of inertial sensors in a
cooperative optical beam tracking system. In order to involve these sensors in our
analysis, we need to include the output of the sensors in the observation set of the
estimators. Then, the estimation and control problems must be solved again for this
extended observation set. We note that the output of each inertial sensor is a noisy
version  of  an  element  of  the  disturbance  vector.  If  there  are  convincing  indications 
that  the  noise  is  additive  and  Gaussian,  the  results  of  [20]  still  can  be  applied  to  the 
estimation  problem. 

Another  method  for  improving  the  performance  of  a  cooperative  optical  beam 
tracking  system  is  to  exchange  information  between  the  stations.  In  one  scenario, 
station b estimates the error vector and sends the estimate back to station a.
Then, the controller of station a employs this estimate to compensate for this error by
means of the point-ahead mirror. In another scenario, each station transmits its
photodetector output to the other station, i.e., the stations estimate their state from
the observation of both stations. Clearly, these schemes improve the performance of
the system by providing more information to the controllers; however, it is not clear
whether this improvement is significant enough to justify the increased complexity.
This must be evaluated by comparing the performance of the schemes by means
of computer simulations.

7.2.2  Optical  Communication  for  Space  Missions 

In recent years, the idea of assigning the task of a large satellite to a cluster of
cooperating micro-satellites has attracted attention for certain space missions [55, 56].
A possible application for this idea is a cluster of micro-satellites equipped with
small aperture antennas, which cooperatively act as a distributed antenna with an
effective aperture size larger than that achievable by a single large satellite. In
addition to cost reduction [4], a cluster of micro-satellites flying in formation has the
advantage of being reconfigurable to meet requirements for different missions [56].

In a multisatellite application, a closed-loop formation-keeping controller maintains
the required constellation in spite of the disturbance forces. An essential
requirement for implementing this controller is reliable communication between
the micro-satellites forming the constellation [48]. Because of its high bandwidth,
power efficiency, and small weight, free-space optics is an attractive means for
communication between the micro-satellites [48]. Moreover, a free-space optical link
can be used simultaneously for the purpose of range and attitude measurement [57],
which is another essential component for formation flying [48].

Deployment  of  the  optical  links  in  a  formation  flying  system  raises  several 
questions  and  design  concerns,  while  at  the  same  time,  it  opens  some  windows  of 
opportunity.  A  fundamental  question  in  this  regard  is  whether  the  communication 
subsystem  (including  the  optical  alignment  components)  and  the  formation-keeping 
controller can be designed independently. For two reasons, the answer to this
question seems to be negative. First, the limitations of this communication scheme (e.g.,
communication loss due to sudden loss of alignment) require special considerations
in  the  design  of  the  formation-keeping  controller.  The  second  reason  is  that  a  certain 
formation  flying  mission  might  impose  constraints  on  the  optical  alignment  system, 
or  it  might  support  a  capability  for  designing  a  more  efficient  and  less  expensive 
alignment  system. 

In  a  formation  flying  system,  the  communication  between  the  members  can  be 
established through an optical network. This provides extra flexibility for the
optical  alignment  systems,  since  for  any  specific  formation  state,  the  network  can 
be  reconfigured  to  achieve  the  best  alignment  performance. 

A stochastic approach to the problems above is the most convincing one [4].
Finding  solutions  to  these  problems  is  a  logical  continuation  of  this  research,  in  light 
of  the  well-established  models  and  methods  we  developed  in  the  present  work. 

7.3  Possible  Applications  in  the  Study  of  Nervous  Systems 

The response of an animal’s sensory nerve to a physical stimulus is a sequence
of electrical pulses called spikes or action potentials [58, 59]. By means of sensory
nerves,  the  information  carried  by  the  stimuli  is  represented  (coded)  in  terms  of 
the  temporal  pattern  of  the  sequences  of  spikes  [58].  An  important  question  in 
the  study  of  nervous  systems  is  how  the  information  is  represented  through  this 
temporal  pattern.  This  question  leads  to  the  associated  decoding  problem:  how  to 
reconstruct  the  stimulus  from  the  spikes. 
The nature of the problem above suggests a stochastic approach.
A complete description of the problem consists of a stochastic model
for the “stimulus-spike” relationship, a stochastic model to describe the temporal
evolution of the stimulus, and a formulation of the decoding problem in terms of an
estimation problem. For example, the authors of [60] characterized place cells$^1$ in
terms of an array of doubly stochastic Poisson processes (place cell-spike frequency
representation). Then, they formulated the decoding problem as estimating the state
of a continuous-time state-space model which modulates the rates of the Poisson
processes. A discrete-time version of this model has been used in [61] in order to
estimate the position of a rat, based on the data collected from its hippocampal
place cells. Also, adaptive filtering techniques have been applied to the model in
order to analyze the plasticity$^2$ of neural receptive fields [62].
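
To make the decoding formulation concrete, the sketch below estimates a one-dimensional position from the spike counts of a few place cells by maximizing a Poisson likelihood over a grid. It is a static toy example and not the model of [60] or [61]: the Gaussian tuning curve, its parameters, and the names `tuning_rate` and `decode_position` are all assumptions made for illustration, and no state dynamics are included.

```python
import math


def tuning_rate(position, center, peak_rate=20.0, width=0.1, baseline=0.5):
    """Hypothetical Gaussian place field: firing rate (spikes/s) at a position."""
    return peak_rate * math.exp(-0.5 * ((position - center) / width) ** 2) + baseline


def decode_position(spike_counts, centers, grid, dt=0.25):
    """Maximum-likelihood position on a grid from independent Poisson counts."""
    def log_likelihood(x):
        total = 0.0
        for count, center in zip(spike_counts, centers):
            mean = tuning_rate(x, center) * dt
            # The log(count!) term is omitted because it does not depend on x.
            total += count * math.log(mean) - mean
        return total
    return max(grid, key=log_likelihood)


# Five cells with evenly spaced field centers; only the middle cell fires.
centers = [0.1, 0.3, 0.5, 0.7, 0.9]
grid = [i / 100 for i in range(101)]
estimate = decode_position([0, 0, 5, 0, 0], centers, grid)
```

Here `estimate` lands at the center of the active cell's field. A dynamic decoder of the kind described above would combine this likelihood with a state-space model for the position.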

These examples closely resemble the stochastic models we worked
with during the present research. Thus, the body of techniques we developed here
might be useful in the study of nervous systems.


$^1$ Place cells are sensory nerves which fire when the animal is close to a particular location.

$^2$ This means that the response of neurons to stimuli changes with experience.